GENES CONTROLLING FLORAL DEVELOPMENT AND
APICAL DOMINANCE IN PLANTS

TECHNICAL FIELD

This invention is related to compositions and methods for affecting plant floral development
and the timing of the transition from vegetative to reproductive growth.

BACKGROUND ART 
Floral initiation is controlled by several factors including photoperiod, cold treatment,
hormones, and nutrients (Coen, Plant Mol. Biol. 42:241-279, 1991; Gasser, Annu. Rev. Plant
Physiol. Plant Mol. Biol. 42:621-649, 1991). Physiological studies have demonstrated that
vegetative tissues are the site for signal perception and for generation of chemicals that cause the
transition from vegetative growth to flowering (Lang, in: Encyclopedia of Plant Physiology, vol.
15, Berlin, ed., Springer-Verlag, pp. 1371-1536, 1965; Zeevaart, in: Light and the Flowering
Process, Vince-Prue et al., eds., Orlando Academic Press, pp. 137-142, 1984). Genetic analysis
revealed that there are several types of mutants that alter flowering time. In Arabidopsis thaliana,
there are at least two mutant groups based on their response to photoperiod and vernalization
(Martinez-Zapater et al., in: Arabidopsis, Meyerowitz and Somerville, eds., Plainview, N.Y.,
Cold Spring Harbor Laboratory, pp. 403-433, 1994). These phenotypes suggest that there are
multiple pathways that lead to flowering.

Studies of mutants that interfere with normal flower development have provided some
information on the mechanisms controlling that development. This has led to the knowledge that
there are at least two genes needed for induction of flower development: LEAFY (LFY) and
APETALA1 (AP1) genes in Arabidopsis (Weigel, Annu. Rev. Genet. 29:19-39, 1995), and
FLORICAULA (FLO) and SQUAMOSA (SQUA) genes in Antirrhinum majus (Bradley et al., Cell
72:85-95, 1993). Cloning and analysis of these genes revealed that the LFY and FLO genes are
homologs and encode proteins that each contain a proline-rich region at the N-terminus and a highly
acidic central region, which are features of certain types of transcription factors, whereas the AP1
and SQUA genes contain a conserved MADS-box sequence (Huijser et al., EMBO J. 11:1239-1249, 1992;
Mandel et al., Nature 360:273-277, 1992). MADS box-containing genes were isolated from several plant
species and are known to play important roles in plant development, especially flower development.
Arabidopsis homeotic genes, AGAMOUS (AG), PISTILLATA (PI), and APETALA3 (AP3), are
members of the MADS box gene family (Yanofsky et al., Nature 346:35-39, 1990; Goto and
Meyerowitz, Genes Dev. 8:1548-1560, 1994; Jack et al., Cell 68:683-697, 1992). Similar
homeotic genes from A. majus, namely PLENA (PLE), GLOBOSA (GLO), and DEFICIENS A
(DEFA), are also MADS box genes (Bradley et al., Cell 72:85-95, 1993; Tröbner et al., EMBO J.
11:4693-4704, 1992; Sommer et al., EMBO J. 9:605-613, 1990). Characterization of these gene
products showed that the conserved MADS box domain is responsible for sequence-specific DNA
binding, dimerization, and attraction of secondary factors (Pellegrini et al., Nature 376:490-498, 1995).
The DNA sequence with which the MADS box domains interact is the consensus binding site,
CC(A/T)6GG (Pollock and Treisman, Genes Dev. 5:2327-2341, 1991; Huang et al., Nucl. Acids
Res. 21:4769-4776, 1993). In addition to the MADS-box domain, the plant MADS box proteins
include the K-box domain, a second conserved region of 65-70 amino acid residues. The K-box
domain was named for its structural resemblance to the coiled-coil domain of keratin (Ma
et al., Genes Dev. 5:484-495, 1991) and has been suggested to be involved in protein-protein
interactions (Pnueli et al., Plant J. 1:255-266, 1991). Similar MADS-box genes have also been
studied in other plants including tomato, rape, tobacco, petunia, maize, and rice (Theißen and
Saedler, Curr. Opin. Genet. Dev. 5:628-639, 1995). A number of plant MADS box genes that
deviate from the functions of the typical meristem identity and organ identity genes have been
identified. These genes are involved in the control of ovule development (Angenent et al., Plant
Cell 7:1569-1582, 1995), vegetative growth (Mandel et al., Plant Mol. Biol. 25:319-321, 1994),
root development (Rounsley et al., Plant Cell 7:1259-1269, 1995), embryogenesis (Heck et al.,
Plant Cell 7:1271-1282, 1995), or symbiotic induction (Heard and Dunn, Proc. Natl. Acad. Sci.
USA 5273-5277, 1995).
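
By way of illustration only, the consensus binding site CC(A/T)6GG mentioned above can be located in a nucleotide sequence with a simple pattern search. The following Python sketch is not part of the original disclosure; the function name and the test sequence are hypothetical, and only exact matches on the given strand are reported.

```python
import re

# Consensus site CC(A/T)6GG written as a regular expression.
# A lookahead is used so that overlapping sites are also reported.
CONSENSUS_SITE = re.compile(r"(?=(CC[AT]{6}GG))")


def find_consensus_sites(sequence):
    """Return (0-based start, matched site) for every CC(A/T)6GG match."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in CONSENSUS_SITE.finditer(seq)]


# Hypothetical test sequence, not taken from any OsMADS gene.
example = "GATTCCATATATGGTTACCTTTTTTGGA"
print(find_consensus_sites(example))
# prints [(4, 'CCATATATGG'), (17, 'CCTTTTTTGG')]
```

A search of this kind identifies candidate sites only; whether a MADS-box protein actually binds a given site is an experimental question.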

There are a large number of MADS box genes in each plant species. In maize, a multigene
family consists of at least 50 different MADS box genes, and these genes are dispersed throughout
the plant genome (Mena et al., Plant J. 8:845-854, 1995; Fischer et al., Proc. Natl. Acad. Sci.
USA 92:5331-5335, 1995). The MADS box multigene family can be divided into several
subfamilies according to their primary sequences, expression patterns, and functions (Theißen and
Saedler, Curr. Opin. Genet. Dev. 5:628-639, 1995).

Several MADS genes from Oryza sativa have been identified and sequenced, including
OsMADS1 (Chung et al., Plant Mol. Biol. 26:657-665, 1994; Chung et al., Plant Sci. 109:45-56,
1995; WO 96/11566) and OsMADSS (Kang et al., Plant Mol. Biol. 29:1-10, 1997). The present
invention is directed to several other distinct MADS genes of Oryza sativa.

The timing of the transition from vegetative growth to flowering is one of the most
important steps in plant development. This step determines the quality and quantity of most crop
species by affecting the balance between vegetative and reproductive growth. It would therefore be
highly desirable to have means to affect the timing of this transition. The present invention meets
these needs and others.






SUMMARY OF THE INVENTION 
The present invention provides, inter alia, compositions and methods related to the
OsMADS6, OsMADS7, and OsMADS8 genes of Oryza sativa and alleles and homologs of such
genes. Expression of such genes in transgenic plants causes an altered phenotype, including
phenotypes related to the timing of the transition between vegetative and reproductive growth.

It is an object of the invention to provide isolated nucleic acids representing MADS genes
and alleles that, when expressed in transgenic plants, confer on the plants at least one phenotype
including (1) diminished apical dominance, (2) early flowering, (3) a partially or completely altered
daylength requirement for flowering, (4) greater synchronization of flowering, or (5) a relaxed
vernalization requirement.

It is another object of the invention to provide isolated nucleic acids comprising (1) a
sequence of at least 30 contiguous nucleotides of OsMADS6 (SEQ ID NO:2) or an allele or
homolog thereof, or (2) a sequence of at least 100 contiguous nucleotides that has at least 70%
nucleotide sequence similarity with OsMADS6. When expressed in a transgenic plant, such nucleic
acids produce at least one phenotype including (1) diminished apical dominance, (2) early
flowering, (3) a partially or completely altered daylength requirement for flowering, (4) greater
synchronization of flowering, or (5) a relaxed vernalization requirement. Preferably, such isolated
nucleic acids comprise only silent or conservative substitutions to a native (wild-type) gene
sequence.

A further object of the invention is to provide transgenic plants comprising such nucleic
acids.

A further object of the invention is to provide probes and primers comprising a fragment of
the OsMADS6 gene, the probes and primers being capable of specifically hybridizing under
stringent conditions to the OsMADS6 gene. Such probes and primers are useful, for example, for
obtaining homologs of the OsMADS6 gene from plants other than rice.

It is a further object of the invention to use the nucleic acids described above to produce
transgenic plants having altered phenotypes, specifically, to introduce such nucleic acids into plant
cells, thereby producing a transformed plant cell, and to regenerate from the transformed plant cell
a transgenic plant comprising the nucleic acid.

The foregoing and other objects and advantages of the invention will become more apparent 
from the following detailed description and accompanying drawings. 

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows nucleotide and deduced amino-acid sequences of the OsMADS6 cDNA (SEQ
ID NO:2 and SEQ ID NO:3). MADS-box and K-box regions are underlined. The positions of
nucleotides and amino acids are shown on the left and right, respectively. The double-underlined
sequence is the PstI site, which was used to generate the gene-specific probe of the 360-bp
fragment located at the 3'-end region of the OsMADS6 cDNA.

FIG. 2 shows nucleotide and deduced amino-acid sequences of the OsMADS7 cDNA (SEQ
ID NO:4 and SEQ ID NO:5). The MADS-box and K-box regions are underlined. The positions of
nucleotides and amino acids are shown on the left and right, respectively. The double-underlined
sequence is the PstI site, which was used to generate the gene-specific probe of the 280-bp
fragment located at the 3'-end region of the OsMADS7 cDNA.

FIG. 3 shows nucleotide and deduced amino-acid sequences of the OsMADS8 cDNA (SEQ
ID NO:6 and SEQ ID NO:7). The MADS-box and K-box regions are underlined. The positions of
nucleotides and amino acids are shown on the left and right, respectively. The double-underlined
sequence is the NheI site, which was used to generate the gene-specific probe of the 230-bp
fragment located at the 3'-end region of the OsMADS8 cDNA.

FIGS. 4A-4C show alignments of MADS-box, K-box, and C-terminal regions of
OsMADS6, OsMADS7, and OsMADS8 proteins (SEQ ID NO:3, 5, and 7, respectively) with other
MADS-box proteins. Gaps were introduced for optimal alignment. FIG. 4A shows alignments of
MADS-box regions. FIG. 4B shows alignment of K-box regions. FIG. 4C shows alignment of
C-terminal regions. 1, OsMADS6 (rice) (SEQ ID NO:3, MADS box, K box, and C-terminal end);
2, ZAG3 (maize) (SEQ ID NO:8, MADS box; SEQ ID NO:20, K box; SEQ ID NO:32, C-terminal
end); 3, ZAG5 (maize) (SEQ ID NO:9, MADS box; SEQ ID NO:21, K box; SEQ ID NO:33,
C-terminal end); 4, AGL6 (Arabidopsis) (SEQ ID NO:10, MADS box; SEQ ID NO:22, K box;
SEQ ID NO:34, C-terminal end); 5, OsMADS8 (rice) (SEQ ID NO:7, MADS box, K box,
and C-terminal end); 6, OsMADS7 (rice) (SEQ ID NO:5, MADS box, K box, and C-terminal
end); 7, FBP2 (petunia) (SEQ ID NO:11, MADS box; SEQ ID NO:23, K box; SEQ ID NO:35,
C-terminal end); 8, TM5 (tomato) (SEQ ID NO:12, MADS box; SEQ ID NO:24, K box; SEQ ID
NO:36, C-terminal end); 9, OM1 (orchid) (SEQ ID NO:13, MADS box; SEQ ID NO:25, K box;
SEQ ID NO:37, C-terminal end); 10, AGL2 (Arabidopsis) (SEQ ID NO:14, MADS box; SEQ ID
NO:26, K box; SEQ ID NO:38, C-terminal end); 11, AGL4 (Arabidopsis) (SEQ ID NO:15,
MADS box; SEQ ID NO:27, K box; SEQ ID NO:39, C-terminal end); 12, OsMADS1 (rice) (SEQ
ID NO:1, MADS box, K box, and C-terminal end); 13, AP1 (Arabidopsis) (SEQ ID NO:16,
MADS box; SEQ ID NO:28, K box; SEQ ID NO:40, C-terminal end); 14, AG (Arabidopsis) (SEQ
ID NO:17, MADS box; SEQ ID NO:29, K box; SEQ ID NO:41, C-terminal end); 15, AP3
(Arabidopsis) (SEQ ID NO:18, MADS box; SEQ ID NO:30, K box; SEQ ID NO:42, C-terminal
end); 16, PI (Arabidopsis) (SEQ ID NO:19, MADS box; SEQ ID NO:31, K box; SEQ ID NO:43,
C-terminal end).

FIG. 5 shows genetic maps of the OsMADS genes. The locations of OsMADS genes along 
with RFLP markers (RG, G), cDNA markers (RZ and C), and microsatellite markers (RM) are 
indicated. Map distance is given in cM on the left of each chromosome. Dark bars represent the 
centromere regions. 



DETAILED DESCRIPTION OF THE INVENTION
Three different MADS-box genes have been isolated from rice: OsMADS6, OsMADS7, and
OsMADS8. OsMADS6-8 were isolated from cDNA libraries under moderate stringency
hybridization conditions using OsMADS1 as a probe, as described below. Details regarding the
isolation, characterization, and nucleotide sequence of OsMADS1 are provided in Chung et al.,
Plant Mol. Biol. 26:657-665, 1994; Chung et al., Plant Sci. 109:45-56, 1995; and WO 96/11566,
and those of OsMADSS in Kang et al., Plant Mol. Biol. 29:1-10, 1997.

The present invention provides compositions and methods related to the MADS-box genes of
rice, OsMADS6, OsMADS7, OsMADS8, and their alleles and homologs (collectively referred to
below as "MADS genes"). These genes are useful, for example, for producing dwarf plants and
for affecting the timing of the transition from vegetative to reproductive growth in a wide variety of
plants, including various dicotyledonous and monocotyledonous crop plants and tree species (see
Schwarz-Sommer et al., Science 250:931-936, 1990, regarding "MADS-box" genes).

Use of the Genes and Their Alleles and Homologs for Crop Improvement

The MADS genes and polypeptides disclosed herein are useful for the following purposes,
among others.

Early flowering. The timing of the transition between vegetative and reproductive growth is
an important agronomic trait, serving as a crucial factor in determining crop yields. Expression of
MADS genes in transgenic plants makes it possible to affect the transition from vegetative to
reproductive growth in a wide variety of plants, whether the plants are long-day, short-day, or
day-neutral plants.

When MADS genes are expressed in transgenic plants of day-neutral species, the resulting
transgenic plants flower earlier than control plants. Transgenic long-day and short-day flowering
plants expressing the MADS genes also flower earlier under permissive conditions than control
plants. The compositions and methods according to the present invention therefore permit one to
reduce the length of the vegetative growth stage of cereal, fruit, vegetable, floricultural, and other
crop species.

Producing dwarf plant varieties. Although it has been possible to enhance the harvest index
in grain crops by the use of dwarfing genes, the isolation of these genes producing dwarf
phenotypes has been difficult.

Transgenic plants comprising a MADS transgene are shorter than controls. Therefore,
MADS genes are useful for producing dwarf plant varieties for a variety of plants including cereal,
fruit, and floricultural species.

Synchronizing reproductive growth. Transgenic plants expressing a MADS transgene flower
more synchronously than controls. Therefore, the gene can be used for crops for which
synchronized harvesting is economically beneficial, allowing more effective use of mechanized
harvesting of fruit species or the production of floricultural species having improved flower quality,
for example.

WO 98/54328 PCT/US98/11278

Producing day-neutral plant varieties. Expression of a MADS transgene in daylength-sensitive
(i.e., long-day or short-day) plants at least partially overrides the photoperiod requirement for
flowering and can completely override the photoperiod requirement. By introducing such a
transgene into a wide variety of photoperiod-sensitive crop or floricultural species, including, but
not limited to, rice, soybeans, chrysanthemums, and orchids, these plants effectively become
day-neutral, permitting multiple crops to be grown per year. For example, flowers can be induced
year-round by introducing the transgene into floricultural species such as chrysanthemum and
orchid.

Delaying flowering and fruiting. By suppressing the expression of a native MADS gene by
conventional approaches (e.g., by employing antisense, co-suppression, gene replacement, or other
conventional approaches to suppressing plant gene expression), flowering and fruiting can be
delayed. Delayed reproductive growth can thereby increase the length of the vegetative growth
stage and cause the plants to grow faster since the energy used for development of flowers and
seeds can be saved for vegetative growth. Thus, delaying or eliminating reproductive growth
results in a higher yield of vegetable species such as spinach, radish, cabbage, or tree species. In
addition, such plants will be more desirable as garden and street species, since their production of
pollen allergens can be reduced or eliminated.

Overcoming the vernalization requirement. MADS genes are useful for overriding the
vernalization requirement of certain plant species. Induction of flowering of transgenic plants that
constitutively express a MADS gene thus becomes insensitive to temperature.

Growing plants in space. Plants grown extraterrestrially are preferably insensitive to
photoperiod and temperature for flowering. Transgenic plants carrying MADS genes would be
expected to flower in the extremely abnormal growth conditions found in a space shuttle or space
station.

Cloning and analysis of alleles and homologs. The availability of MADS genes makes it 
possible to obtain alleles and homologs of these genes by conventional methods, through the use of 
nucleic acid and antibody probes and primers, as discussed below. 


DEFINITIONS AND METHODS 

The following definitions and methods are provided to better define the present invention
and to guide those of ordinary skill in the art in the practice of the present invention. Definitions of
common terms in molecular biology may also be found in Rieger et al., Glossary of Genetics:
Classical and Molecular, 5th edition, Springer-Verlag: New York, 1991; and Lewin, Genes VI,
Oxford University Press: New York, 1997.






The term "plant" encompasses any plant and progeny thereof. The term also encompasses 
parts of plants, including seed, cuttings, tubers, fruit, flowers, etc. 

A "reproductive unit" of a plant is any totipotent part or tissue of the plant from which one 
can obtain a progeny of the plant, including, for example, seeds, cuttings, buds, bulbs, somatic 
embryos, etc. 

"Natural photoperiod conditions" are photoperiod (i.e., daylength) conditions as provided by
sunlight at a given location, whether or not under field conditions. A photoperiod provided by
artificial lighting but having a daylength approximating that of sunlight would also be considered
a natural photoperiod condition.



Nucleic Acids 

Nucleic acids useful in the practice of the present invention comprise at least one of the
isolated genes disclosed herein, namely OsMADS6, OsMADS7, and OsMADS8, and their alleles,
homologs, fragments, and variant forms.

The term "MADS gene," for example, refers to a plant gene that contains a MADS-box
sequence, and preferably also a K-box sequence, and that is associated with one or more of the
following phenotypes when expressed as a transgene in transgenic plants: (1) diminished apical
dominance (as shown, for example, by dwarf stature) and (2) early flowering, and can also be
associated with, for example, (3) altered daylength requirement for flowering; (4) greater
synchronization of flowering; and (5) relaxed vernalization requirement. The MADS gene
encompasses the respective coding sequences and genomic sequences flanking the coding sequence
that are operably linked to the coding sequence, including regulatory elements and/or intron
sequences. Also encompassed are alleles and homologs.

The term "native" refers to a naturally-occurring nucleic acid or polypeptide, including a
wild-type sequence and an allele thereof.

A "homolog" of a MADS gene is a native gene sequence isolated from a plant species other
than the species from which the MADS gene was originally isolated and having similar biological
activities, e.g., dwarfism and early flowering.

"Isolated." An "isolated" nucleic acid has been substantially separated or purified away
from other nucleic acid sequences in the cell of the organism in which the nucleic acid naturally
occurs, i.e., other chromosomal and extrachromosomal DNA and RNA. The term "isolated" thus
encompasses nucleic acids purified by standard nucleic acid-purification methods. The term also
embraces nucleic acids prepared by recombinant expression in a host cell as well as chemically
synthesized nucleic acids.

DNA constructs incorporating a MADS gene or fragment thereof preferably place the
protein-coding sequence under the control of an operably linked promoter that is capable of
expression in a plant cell. Various promoters suitable for expression of heterologous genes in plant
cells are known in the art, including constitutive promoters, e.g., the cauliflower mosaic virus
(CaMV) 35S promoter, which is expressed in many plant tissues; organ- or tissue-specific
promoters; and promoters that are inducible by chemicals such as methyl jasmonate, salicylic acid,
or Safener, for example.

Plant transformation and regeneration. In addition to the methods for plant transformation
and regeneration described in the Examples below for making transgenic plants, other well-known
methods can be employed.

Fragments, probes, and primers. A fragment of an OsMADS nucleic acid according to the
present invention is a portion of the nucleic acid that is less than full-length and comprises at least a
minimum length capable of hybridizing specifically with the corresponding OsMADS nucleic acid
(or a sequence complementary thereto) under stringent conditions as defined below. The length of
such a fragment is preferably at least 15 nucleotides, more preferably at least 30
nucleotides, more preferably at least 50 nucleotides, and most preferably at least 100 nucleotides.

A "probe" comprises an isolated nucleic acid attached to a detectable label or reporter
molecule well known in the art. Typical labels include radioactive isotopes, ligands,
chemiluminescent agents, and enzymes.

"Primers" are short nucleic acids, preferably DNA oligonucleotides 15 nucleotides or more
in length, that can be annealed to a complementary target DNA strand by nucleic acid hybridization
to form a hybrid between the primer and the target DNA strand, then extended along the target
DNA strand by a polymerase, preferably a DNA polymerase. Primer pairs can be used for
amplification of a nucleic acid sequence, e.g., by the polymerase chain reaction (PCR) or other
nucleic-acid amplification methods well known in the art. PCR-primer pairs can be derived from
the sequence of a nucleic acid according to the present invention, for example, by using computer
programs intended for that purpose such as Primer (Version 0.5, © 1991, Whitehead Institute for
Biomedical Research, Cambridge, MA).
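
As a minimal sketch of the primer definition above, and not a substitute for a primer-design program, the following Python example checks that a candidate primer meets the preferred minimum length of 15 nucleotides and reports where it would anneal by exact complementarity on either strand of a template. The function names and the 45-nucleotide template are hypothetical and are not taken from any sequence disclosed herein.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def primer_sites(primer, template, min_length=15):
    """List exact annealing sites of a primer on a double-stranded template.

    "+" means the primer is identical to the given (top) strand and would be
    extended along the bottom strand; "-" means its reverse complement is
    found in the top strand. Positions are 0-based on the top strand.
    """
    primer, template = primer.upper(), template.upper()
    if len(primer) < min_length:
        raise ValueError(f"primer shorter than {min_length} nucleotides")
    sites = []
    for strand, target in (("+", primer), ("-", reverse_complement(primer))):
        start = template.find(target)
        while start != -1:
            sites.append((strand, start))
            start = template.find(target, start + 1)
    return sites


# Hypothetical template, used only to exercise the functions.
template = "ATGGGTCGTGGTAAGGTTCAGCTGAAGCGTATCGAGAACAAGATC"
forward = "ATGGGTCGTGGTAAGGTT"
reverse = reverse_complement(template[-15:])

print(primer_sites(forward, template))  # prints [('+', 0)]
print(primer_sites(reverse, template))  # prints [('-', 30)]
```

Only perfect matches are considered here; real primer selection also weighs melting temperature, self-complementarity, and specificity against the whole genome.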

Methods for preparing and using probes and primers are described, for example, in
Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, ed. Sambrook et
al., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989; Current Protocols in
Molecular Biology, ed. Ausubel et al., Greene Publishing and Wiley-Interscience, New York, 1987
(with periodic updates); and Innis et al., PCR Protocols: A Guide to Methods and Applications,
Academic Press: San Diego, 1990.

Substantial similarity. A first nucleic acid is "substantially similar" to a second nucleic acid
if, when optimally aligned (with appropriate nucleotide insertions or deletions) with the other
nucleic acid (or its complementary strand), there is nucleotide sequence identity in at least about
70%-90% of the nucleotide bases, and preferably greater than 90% of the nucleotide bases.
("Substantial sequence complementarity" requires a similar degree of sequence complementarity.)
Sequence similarity can be determined by comparing the nucleotide sequences of two nucleic acids
using sequence analysis software such as the Sequence Analysis Software Package of the Genetics
Computer Group, University of Wisconsin Biotechnology Center, Madison, WI.

Alternatively, two nucleic acids are substantially similar if they hybridize under stringent 
conditions, as defined below. 
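
The identity criterion above can be illustrated with a short calculation on two sequences that have already been aligned. This Python sketch is offered under stated assumptions and does not reproduce the Genetics Computer Group software: the function names are hypothetical, gaps are written as "-", and identity is counted over all alignment columns in which at least one sequence has a base, which is only one of several conventions in use.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    Columns where both sequences have a gap are ignored; a column with a
    gap in only one sequence counts as a non-identical position.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def is_substantially_similar(aligned_a, aligned_b, threshold=70.0):
    """Apply the lower bound (about 70%) of the range given in the text."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical aligned fragments: 10 identical columns out of 12.
a = "ATGGCTAGCT-A"
b = "ATGGCAAGCTTA"
print(round(percent_identity(a, b), 1))  # prints 83.3
print(is_substantially_similar(a, b))    # prints True
```

The result depends on how the alignment was produced and on how gaps are scored, so a value near the threshold should be confirmed with an established alignment tool.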

Operably linked. A first nucleic-acid sequence is "operably" linked with a second
nucleic-acid sequence when the first nucleic-acid sequence is placed in a functional relationship with the
second nucleic-acid sequence. For instance, a promoter is operably linked to a coding sequence if
the promoter affects the transcription or expression of the coding sequence. Generally, operably
linked DNA sequences are contiguous and, where necessary to join two protein coding regions, in
reading frame.

"Recombinant." A "recombinant" nucleic acid is one that has a sequence that is not
naturally occurring or has a sequence that is made by an artificial combination of two otherwise
separated segments of sequence. This artificial combination is often accomplished by chemical
synthesis or, more commonly, by the artificial manipulation of isolated segments of nucleic acids,
e.g., by genetic engineering techniques.

Techniques for nucleic-acid manipulation are described generally in, for example, Sambrook 
et al. (1989) and Ausubel et al. (1987, with periodic updates). 

Preparation of recombinant or chemically synthesized nucleic acids; vectors, transformation,
host cells. Large amounts of a nucleic acid according to the present invention can be produced by
recombinant means well known in the art or by chemical synthesis.

Natural or synthetic nucleic acids according to the present invention can be incorporated into
recombinant nucleic-acid constructs, typically DNA constructs, capable of introduction into and
replication in a host cell. Usually the DNA constructs will be suitable for replication in a
unicellular host, such as E. coli or other commonly used bacteria, but can also be introduced into
yeast, mammalian, plant or other eukaryotic cells.

Preferably, such a nucleic-acid construct is a vector comprising a replication system
recognized by the host. For the practice of the present invention, well-known compositions and
techniques for preparing and using vectors, host cells, introduction of vectors into host cells, etc.
are employed, as discussed, inter alia, in Sambrook et al., 1989, or Ausubel et al., 1987.

A tissue, organ, or organism into which has been introduced a nucleic acid according to
an embodiment of the present invention, such as a recombinant vector, is considered "transformed"
or "transgenic." A recombinant DNA construct that is present in a transgenic host cell, particularly
a transgenic plant, is referred to as a "transgene." The term "transgenic" or "transformed," when
referring to a cell or organism, also includes (1) progeny of the cell or organism and (2) plants
produced from a breeding program employing such a "transgenic" plant as a parent in a cross and
exhibiting an altered phenotype resulting from the presence of the recombinant DNA construct.

Conventional methods for chemical synthesis of nucleic acids are described, for example, in
Beaucage and Carruthers, Tetra. Letts. 22:1859-1862, 1981, and Matteucci et al., J. Am. Chem.
Soc. 103:3185, 1981. Chemical synthesis of nucleic acids can be performed, for example, on
commercial automated oligonucleotide synthesizers.

Nucleic-acid hybridization; "stringent conditions"; "specific." The term "stringent
conditions" is functionally defined with regard to the hybridization of a nucleic-acid probe to a
target nucleic acid (i.e., to a particular nucleic-acid sequence of interest) by the hybridization
procedure discussed in Sambrook et al., 1989 at 9.52-9.55. See also, Sambrook et al., 1989 at
9.47-9.52, 9.56-9.58; Kanehisa, Nuc. Acids Res. 12:203-213, 1984; and Wetmur and Davidson, J.
Mol. Biol. 31:349-370, 1968. According to one embodiment of the invention, "moderate
stringency" hybridization conditions are hybridization at 60°C in a hybridization solution including
6X SSC, 5X Denhardt's reagent, 0.5% SDS, 100 µg/mL denatured, fragmented salmon sperm
DNA, and the labeled probe (Sambrook et al., 1989), and "high stringency" conditions are
hybridization at 65-68°C in the same hybridization solution.

Regarding the amplification of a target nucleic-acid sequence (e.g., by PCR) using a
particular amplification primer pair, stringent conditions are conditions that permit the primer pair
to hybridize only to the target nucleic-acid sequence to which a primer having the corresponding
wild-type sequence (or its complement) would bind.

Nucleic-acid hybridization is affected by such conditions as salt concentration, temperature,
or organic solvents, in addition to the base composition, length of the complementary strands, and
the number of mismatched bases between the hybridizing nucleic acids.
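
For illustration only, the qualitative dependence of hybrid stability on salt concentration, base composition, strand length, and mismatches can be sketched with a widely quoted empirical approximation for the melting temperature of long DNA duplexes. This calculation is not part of the original disclosure. The function name is hypothetical, the length constant differs between published versions of the formula (it is exposed here as a parameter with an assumed default), and the reduction of about 1°C per 1% mismatch is a rule of thumb rather than a measured value.

```python
import math


def estimate_tm(gc_percent, length, na_molar,
                mismatch_percent=0.0, length_constant=600.0):
    """Rough melting temperature (degrees C) of a long DNA duplex.

    Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - length_constant/length
         - 1.0*(% mismatch)

    All coefficients are empirical; treat the result as an estimate only.
    """
    if length <= 0 or na_molar <= 0:
        raise ValueError("length and na_molar must be positive")
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - length_constant / length
            - 1.0 * mismatch_percent)


# Hypothetical 300-nucleotide probe with 50% GC content.
perfect = estimate_tm(gc_percent=50, length=300, na_molar=1.0)
diverged = estimate_tm(gc_percent=50, length=300, na_molar=1.0,
                       mismatch_percent=20)
low_salt = estimate_tm(gc_percent=50, length=300, na_molar=0.1)
print(perfect > diverged, perfect > low_salt)  # prints True True
```

The comparison shows the trend described in the text: mismatched hybrids and low-salt conditions both lower the estimated melting temperature, so a fixed hybridization temperature becomes more stringent as salt is reduced.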

When referring to a probe or primer, the term "specific for (a target sequence)" indicates 
that the probe or primer hybridizes only to the target sequence in a given sample comprising the 
target sequence. 

Nucleic-acid amplification. As used herein, "amplified DNA" refers to the product of
nucleic-acid amplification of a target nucleic-acid sequence. Nucleic-acid amplification can be
accomplished by any of the various nucleic-acid amplification methods known in the art, including
the polymerase chain reaction (PCR). A variety of amplification methods are known in the art and
are described, inter alia, in U.S. Patent Nos. 4,683,195 and 4,683,202 and in PCR Protocols: A
Guide to Methods and Applications, Innis et al., eds., Academic Press, San Diego, 1990.

In situ hybridization. A number of techniques have been developed in which nucleic-acid
probes are used to locate specific DNA sequences on intact chromosomes in situ, a procedure
called "in situ hybridization." See, e.g., Pinkel et al., Proc. Natl. Acad. Sci. USA 85:9138-9142,
1988 (regarding fluorescence in situ hybridization), and Lengauer et al., Hum. Mol. Genet.
2:505-512, 1993 (regarding "chromosomal bar codes"). Well-known methods for in situ hybridization
and for the preparation of probes or primers for such methods are employed in the practice of the
present invention, including direct and indirect in situ hybridization methods.

Methods of obtaining genomic clones, alleles, and homologs. Based upon the availability of
the nucleotide sequences of the MADS genes disclosed herein, other MADS genes (e.g., alleles and
homologs) and genomic clones corresponding thereto can be readily obtained from a wide variety
of plants by cloning methods known in the art.

For example, one or more primer pairs can be used to amplify such alleles or homologs by
the polymerase chain reaction (PCR). Alternatively, the disclosed cDNA of OsMADS6,
OsMADS7, or OsMADS8, or a fragment thereof, can be used to probe a cDNA or genomic library
made from a given plant species.

Nucleotide-sequence and amino-acid sequence variants. "Variant" DNA molecules are
DNA molecules containing minor changes to a native, or wild-type, sequence, i.e., changes in
which one or more nucleotides of a native sequence are deleted, added, and/or substituted while
substantially maintaining wild-type biological activity. Variant DNA molecules can be produced,
for example, by standard DNA mutagenesis techniques or by chemically synthesizing the variant
DNA molecule. Such variants do not change the reading frame of the protein-coding region of the
nucleic acid.

Amino-acid substitutions are preferably substitutions of single amino-acid residues. DNA
insertions are preferably of about 1 to 10 contiguous nucleotides and deletions are preferably of
about 1 to 30 contiguous nucleotides. Insertions and deletions are preferably insertions or deletions
from an end of the protein-coding or non-coding sequence (i.e., a truncation of the native sequence)
and are preferably made in adjacent base pairs. Substitutions, deletions, insertions or any
combination thereof can be combined to arrive at a final construct. For the sequences disclosed
herein, amino acid substitutions preferably are located outside sequences that are conserved among
OsMADS6-8 and homologs thereof.

Preferably, variant nucleic acids according to the present invention are "silent" or
"conservative" variants. "Silent" variants are variants of a native sequence in which there has been
a substitution of one or more base pairs but no change in the amino-acid sequence of the
polypeptide encoded by the sequence. "Conservative" variants are variants of a native sequence in
which at least one codon in the protein-coding region of the native sequence has been changed,
resulting in a conservative change in one or more amino-acid residues of the polypeptide encoded
by the nucleic-acid sequence, i.e., an amino-acid substitution. A number of conservative
amino-acid substitutions are listed in Table 1. In addition, there can be a substitution (resulting
in a net gain or loss) of one or more cysteine residues, thereby affecting disulfide linkages in the
encoded polypeptide.

TABLE 1

Original Residue        Conservative Substitutions

Ala                     ser
Arg                     lys
Asn                     gln; his
Asp                     glu
Cys                     ser
Gln                     asn
Glu                     asp
Gly                     pro
His                     asn; gln
Ile                     leu; val
Leu                     ile; val
Lys                     arg; gln; glu
Met                     leu; ile
Phe                     met; leu; tyr
Ser                     thr
Thr                     ser
Trp                     tyr
Tyr                     trp; phe
Val                     ile; leu

Substantial changes in function are made by selecting substitutions that are less conservative
than those in Table 1, i.e., selecting residues that differ more significantly in their effect on
maintaining: (a) the structure of the polypeptide backbone in the area of the substitution, for
example, as a sheet or helical conformation; (b) the charge or hydrophobicity of the molecule at the
target site; or (c) the bulk of the side chain. The substitutions which in general are expected to
produce the greatest changes in protein properties are those in which: (a) a hydrophilic residue,
e.g., seryl or threonyl, is substituted for (or by) a hydrophobic residue, e.g., leucyl, isoleucyl,
phenylalanyl, valyl or alanyl; (b) a cysteine or proline is substituted for (or by) any other residue;
(c) a residue having an electropositive side chain, e.g., lysyl, arginyl, or histidyl, is substituted for
(or by) an electronegative residue, e.g., glutamyl or aspartyl; or (d) a residue having a bulky side
chain, e.g., phenylalanine, is substituted for (or by) one not having a side chain, e.g., glycine.
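
The distinctions drawn above among silent, conservative, and less conservative changes can be
illustrated with a short Python sketch. It is illustrative only: the codon table is a small excerpt
of the standard genetic code, the substitution map copies just three rows of Table 1 (Ser, Thr, and
Lys), and the function name classify_codon_change is a hypothetical helper rather than part of any
method disclosed herein.

```python
# Excerpt of the standard genetic code; only the codons used in this example.
CODON_TO_AA = {
    "TCC": "Ser", "TCT": "Ser", "ACC": "Thr",
    "AAG": "Lys", "AGG": "Arg", "GAG": "Glu",
    "TGG": "Trp",
}

# Three rows of Table 1 (original residue -> conservative substitutions).
CONSERVATIVE = {
    "Ser": {"Thr"},
    "Thr": {"Ser"},
    "Lys": {"Arg", "Gln", "Glu"},
}


def classify_codon_change(native: str, variant: str) -> str:
    """Classify a single-codon substitution relative to the native codon."""
    aa_native = CODON_TO_AA[native]
    aa_variant = CODON_TO_AA[variant]
    if aa_native == aa_variant:
        return "silent"            # different codon, same amino acid
    if aa_variant in CONSERVATIVE.get(aa_native, set()):
        return "conservative"      # replacement listed in Table 1
    return "non-conservative"      # replacement not listed in Table 1


print(classify_codon_change("TCC", "TCT"))  # Ser -> Ser
print(classify_codon_change("TCC", "ACC"))  # Ser -> Thr
print(classify_codon_change("AAG", "TGG"))  # Lys -> Trp
```

A complete implementation would need the full 64-codon table and all rows of Table 1; the excerpt
is kept small so that each case can be checked by eye.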

Polypeptides 

The term "OsMADS protein" (or "OsMADS polypeptide") refers to a protein encoded by
an OsMADS gene that has at least about 70% homology with a given native OsMADS polypeptide
and preferably retains biological activity of the wild-type polypeptide. An OsMADS polypeptide
can be isolated from a natural source, produced by the expression of a recombinant OsMADS
nucleic acid, or chemically synthesized by conventional methods, for example.

Polypeptide sequence homology. Ordinarily, the polypeptides encompassed by the present
invention are at least about 90% homologous to a native polypeptide, and more preferably at least
about 95% homologous. Preferably, such polypeptides also have characteristic structural features
and biological activity of the native polypeptide.
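
A percentage of this kind can be computed from two sequences that have already been aligned, as
in the minimal Python sketch below. It is not the algorithm of any particular sequence analysis
package; the function name percent_identity, the gap character "-", and the decision to leave
gapped columns out of the denominator are assumptions made for illustration, and the two short
sequences are invented.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gapped columns are not counted (an assumption of this sketch)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Invented alignment: 10 columns, 9 ungapped, 8 of those identical.
print(round(percent_identity("MGRGKVELKR", "MGRGRVEL-R"), 1))
```

Whether gaps count against identity changes the figure, so any reported percentage should state
the convention used.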

Polypeptide homology is typically analyzed using sequence analysis software such as the
Sequence Analysis Software Package of the Genetics Computer Group, University of Wisconsin
Biotechnology Center, Madison, WI. Polypeptide sequence analysis software matches homologous
sequences using measures of homology assigned to various substitutions, deletions, insertions,
and other modifications.

"Isolated," "Purified," "Homogeneous" Polypeptides. A polypeptide is "isolated" if it has
been separated from the cellular components (nucleic acids, lipids, carbohydrates, and other
polypeptides) that naturally accompany it. Such a polypeptide can also be referred to as "pure" or
"homogeneous" or "substantially" pure or homogeneous. Thus, a polypeptide which is chemically
synthesized or recombinant (i.e., the product of the expression of a recombinant nucleic acid, even
if expressed in a homologous cell type) is considered to be isolated. A monomeric polypeptide is
isolated when at least 60-90% by weight of a sample is composed of the polypeptide, preferably
95% or more, and more preferably more than 99%. Protein purity or homogeneity is indicated,
for example, by polyacrylamide gel electrophoresis of a protein sample, followed by visualization
of a single polypeptide band upon staining the polyacrylamide gel; high pressure liquid
chromatography; or other methods known in the art.

Protein purification. The polypeptides of the present invention can be purified by any of the
means known in the art. Various methods of protein purification are described, e.g., in Guide to
Protein Purification, ed. Deutscher, Meth. Enzymol. 185, Academic Press, San Diego, 1990; and
Scopes, Protein Purification: Principles and Practice, Springer Verlag, New York, 1982.

Variant forms of polypeptides; labeling. Variant polypeptides are those in which there have
been substitutions, deletions, insertions or other modifications of a native polypeptide sequence.
Variant polypeptides substantially retain structural and/or biological characteristics and are
preferably silent or conservative substitutions of one or a small number of amino acid residues.

Wild-type polypeptide sequences can be modified by conventional methods, e.g., by
acetylation, carboxylation, phosphorylation, glycosylation, ubiquitination, and labeling, whether
accomplished by in vivo or in vitro enzymatic treatment of a native polypeptide or by protein
synthesis using modified amino acids.

Any of a variety of conventional methods and reagents for labeling polypeptides and
fragments thereof can be employed in the practice of the invention. Typical labels include
radioactive isotopes, ligands or ligand receptors, fluorophores, chemiluminescent agents, and
enzymes. Methods for labeling and guidance in the choice of labels appropriate for various
purposes are discussed, e.g., in Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, ed.
Sambrook et al., Cold Spring Harbor Laboratory Press: Cold Spring Harbor, NY, 1989; and
Current Protocols in Molecular Biology, ed. Ausubel et al., Greene Publishing and
Wiley-Interscience: New York, 1987 (with periodic updates).

Polypeptide Fragments. The present invention also encompasses polypeptide fragments that
lack at least one residue of a native full-length polypeptide yet retain at least one of the biological
activities characteristic of the native polypeptide. For example, the fragment can cause early
flowering or dwarf phenotypes when expressed as a transgene in a host plant. An immunologically
active fragment of a given full-length polypeptide is capable of raising antibodies specific for the
full-length polypeptide in a target immune system (e.g., murine or rabbit) or of competing with the
full-length polypeptide for binding to such specific antibodies, and is thus useful in immunoassays
for the presence of the native polypeptide in a biological sample. Such immunologically active
fragments typically have a minimum size of 7 to 17 amino acids.

Fusion polypeptides. The present invention also provides fusion polypeptides including, for
example, heterologous fusion polypeptides, e.g., a fusion between an OsMADS polypeptide
sequence or fragment thereof and a heterologous polypeptide sequence, e.g., a sequence from a
different polypeptide. Such heterologous fusion polypeptides generally exhibit biological properties
(such as ligand-binding, catalysis, secretion signals, antigenic determinants, etc.) derived from each
of the fused sequences. Fusion partners include, for example, immunoglobulins, beta
galactosidase, trpE, protein A, beta lactamase, alpha amylase, alcohol dehydrogenase, yeast alpha
mating factor, and various signal and leader sequences which, e.g., can direct the secretion of the
polypeptide. Fusion polypeptides can be made, for example, by the expression of recombinant
nucleic acids or by chemical synthesis.

Polypeptide sequence determination. The sequence of a polypeptide can be determined by
any conventional method. In order to determine the sequence of a polypeptide, the polypeptide is
typically fragmented, the fragments separated, and the sequence of each fragment determined. To
obtain fragments of a polypeptide for sequence determination, for example, the polypeptide can be
digested with an enzyme such as trypsin, clostripain, or Staphylococcus protease, or with chemical
agents such as cyanogen bromide, o-iodosobenzoate, hydroxylamine or
2-nitro-5-thiocyanobenzoate. Peptide fragments can be separated, e.g., by reversed-phase
high-performance liquid chromatography (HPLC) and analyzed by gas-phase sequencing, for example.
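
For orientation, the trypsin step can be mimicked in silico. The Python sketch below applies the
commonly used simplified rule that trypsin cleaves on the carboxyl side of lysine (K) or arginine
(R) unless the next residue is proline (P); real digests deviate from this rule. The function name
tryptic_fragments is hypothetical and the example peptide is invented.

```python
from typing import List


def tryptic_fragments(protein: str) -> List[str]:
    """Split a one-letter protein sequence after K or R, unless followed by P."""
    fragments = []
    start = 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            fragments.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        fragments.append(protein[start:])  # C-terminal remainder with no K/R at its end
    return fragments


# Invented peptide; the K followed by P is not cleaved under the simplified rule.
print(tryptic_fragments("MGRGKVELKPIENK"))
```

Predicted fragments of this kind are useful mainly for planning; the fragments actually recovered
must still be separated and sequenced as described above.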

Antibodies 

The present invention also encompasses polyclonal and/or monoclonal antibodies capable of
specifically binding to any of the polypeptides disclosed herein. Such antibodies can be produced
by any conventional method. "Specific" antibodies are capable of distinguishing a given
polypeptide from other polypeptides in a sample. Specific antibodies are useful, for example, in
purifying a polypeptide from a biological sample; in cloning alleles or homologs of a given gene
sequence from an expression library; as antibody probes for protein blots and immunoassays; etc.

For the preparation and use of antibodies according to the present invention, including
various antibody labelling and immunoassay techniques and applications, see, e.g., Goding,
Monoclonal Antibodies: Principles and Practice, 2d ed, Academic Press, New York, 1986; and
Harlow and Lane, Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring
Harbor, NY, 1988. Suitable labels for antibodies include radionuclides, enzymes, substrates,
cofactors, inhibitors, fluorescent agents, chemiluminescent agents, magnetic particles and the like.

The invention will be better understood by reference to the following Examples, which are 
intended to merely illustrate the best mode now known for practicing the invention. The scope of 
the invention is not to be considered limited thereto, however. 

EXAMPLES 

EXAMPLE 1: Isolation and Characterization of Three Rice MADS-Box Genes that Control the
Timing of Flowering: OsMADS6, OsMADS7, and OsMADS8

Experimental procedures

Bacterial Strains, Plant Materials, and Plant Transformation. Escherichia coli MC1000
(ara, leu, lac, gal, str) was used as the recipient for routine cloning experiments. Rice (Oryza
sativa L. cv. M201) plants were grown in a growth chamber at 26°C with a 10.5-hr day cycle.

Bacterial strains, plant materials, and plant transformation. Escherichia coli JM83 was
used as the recipient for routine cloning experiments. Agrobacterium tumefaciens LBA4404
(Hoekema et al., Nature 303:179-181, 1983) containing the Ach5 chromosomal background and a
disarmed helper-Ti plasmid pAL4404 was used for transformation of tobacco plants (N. tabacum L.
cv. Xanthi) by the cocultivation method (An et al., in: Plant Molecular Biology Manual, Gelvin
and Schilperoort, eds., Kluwer Academic, Dordrecht, Belgium, pp. A3/1-19, 1988). Transgenic
tobacco plants were maintained under greenhouse conditions. Rice (Oryza sativa L. cv. M201)
plants were grown in a growth chamber at 29°C with a 10.5 h day cycle.

Library screening and sequence analysis. cDNA libraries were constructed using the λ ZapII
vector (Stratagene) and mRNA prepared from rice flowers at the floral primordial stage, when the
length of the panicles was below 1 cm. Hybridization was performed with 10^5 plaques using a
32P-labeled probe of the OsMADS1 coding region. The cDNA insert was rescued in vivo using an f1
helper phage, R408 (Stratagene). Both strands of the cDNA were sequenced by the
dideoxynucleotide chain termination method using double-stranded DNA as a template (Sanger et al.,
Proc. Natl. Acad. Sci. USA 74:5463-5467, 1977). Protein-sequence similarity was analyzed by the IG
Suite software package (Intelligenetics Co., Mountain View, CA) and the NCBI non-redundant
protein database on the international network.

DNA and RNA blot analyses. Genomic DNA was isolated by the cetyltrimethylammonium
bromide (CTAB) method from two-week-old rice seedlings grown hydroponically (Rogers and
Bendich, Plant Molecular Biology Manual, Kluwer Academic, Dordrecht, Belgium, pp. A6/1-10,
1988). Eight μg of genomic DNA was digested with the appropriate restriction enzymes,
separated on a 0.7% agarose gel, blotted onto a nylon membrane, and hybridized with a 32P-labeled
probe for 16 h at 65°C, followed by a wash with a solution containing 2X SSC and 0.5% SDS for
20 min at 65°C, followed by a wash with a solution of 0.1X SSC and 0.1% SDS for 15 min at the
same temperature. Total RNA was isolated by the guanidinium thiocyanate method (Sambrook et
al., 1989). Leaf and root samples were harvested from the two-week-old seedlings. Floral organ
samples were obtained by dissecting late vacuolated-stage flowers under a dissecting microscope.
Twenty-five μg of total RNA was fractionated on a 1.3% agarose gel as described previously
(Sambrook et al., 1989). After RNA transfer onto a nylon membrane, the resulting blot was
hybridized in a solution containing 0.5 M NaPO4 (pH 7.2), 1 mM EDTA, 1% BSA, and 7% SDS
for 20 h at 60°C (Church and Gilbert, Proc. Natl. Acad. Sci. USA 81:1191-1195, 1984). After
hybridization, the blot was washed twice with a solution containing 0.1X SSPE and 0.1% SDS for
5 min at room temperature followed by two washes of the same solution at 60°C for 15 min.

Mapping procedures. An F11 recombinant inbred population consisting of 164 lines derived
from a cross between Milyang 23 and Gihobyeo was used to construct a molecular map.
Three-week-old leaf tissue was harvested from over one hundred seedlings for each F11 line and
bulked for DNA extraction. DNA was digested with restriction enzymes (BamHI, DraI, EcoRI, HindIII,
EcoRV, ScaI, XbaI, KpnI) and 8 μg per lane was used to make mapping filters. DNA blotting and
hybridization were performed as described above. Linkage analysis was performed using
Mapmaker Version 3.0 (Lander et al., Genomics 12:174-181, 1987) on a Macintosh Power PC
8100/80AV. Map units (cM) were derived using the Kosambi function (Kosambi, Ann. Eugen.
12:172-175, 1944).
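
The Kosambi function referred to above converts a recombination fraction r into a map distance
d in centimorgans, d = 25 ln[(1 + 2r)/(1 - 2r)]. The short Python sketch below evaluates it; the
function name kosambi_cm is hypothetical, and the value r = 0.10 is an arbitrary illustration that
is not taken from the mapping data.

```python
import math


def kosambi_cm(r: float) -> float:
    """Kosambi map distance in cM for a recombination fraction r (0 <= r < 0.5)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must satisfy 0 <= r < 0.5")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


# Arbitrary example: a recombination fraction of 0.10 gives about 10.1 cM.
print(round(kosambi_cm(0.10), 2))
```

For small r the distance in cM is close to 100r; the correction grows as r approaches 0.5, where
the function diverges.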



Results 

Isolation of rice cDNA clones encoding MADS-box proteins. Three cDNA clones were
isolated by screening a λ ZapII cDNA library that was prepared from rice floral primordia using
OsMADS1 cDNA as a probe (described above). These clones were designated OsMADS6,
OsMADS7, and OsMADS8. DNA sequence analysis showed that these clones are 1180 bp to 1259
bp long and encode putative proteins 248 to 250 amino acid residues long (OsMADS6: FIG. 1, SEQ
ID NO:2 and SEQ ID NO:3; OsMADS7: FIG. 2, SEQ ID NO:4 and SEQ ID NO:5; OsMADS8:
FIG. 3, SEQ ID NO:6 and SEQ ID NO:7). The 5'-untranslated region of the OsMADS8 cDNA
contains eight repeats of the GGA sequence and the 5'-untranslated region of the OsMADS7 cDNA
contains six repeats of the GGT sequence, a so-called microsatellite (Browne and Litt, Nucl. Acids
Res. 20:141, 1991; Stallings, Genomics 17:890-891, 1992). Such repeat sequences have been
observed in other rice MADS-box genes (Chung et al., Plant Mol. Biol. 26:657-665, 1994).
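
Tandem trinucleotide repeats of the kind described above can be located with a regular
expression. The Python sketch below is illustrative only: the test sequence is invented (it is not
the 5'-untranslated region of any OsMADS cDNA), and the function name find_tandem_repeats and the
default threshold of four repeats are assumptions.

```python
import re


def find_tandem_repeats(seq: str, unit: str, min_repeats: int = 4):
    """Return (start_index, repeat_count) for each run of `unit` repeated
    at least `min_repeats` times in a row. Indices are 0-based."""
    pattern = re.compile("(?:%s){%d,}" % (re.escape(unit), min_repeats))
    return [(m.start(), len(m.group(0)) // len(unit)) for m in pattern.finditer(seq)]


# Invented sequence carrying eight tandem copies of GGA.
example = "ATCG" + "GGA" * 8 + "TTC"
print(find_tandem_repeats(example, "GGA"))
```

The match is greedy, so each run is reported once at its full length rather than as overlapping
shorter runs.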

The MADS-box domain of the cDNA clone is located between the 2nd and 57th amino
acids of each protein (FIG. 1 and SEQ ID NO:3; FIG. 2 and SEQ ID NO:5; FIG. 3 and SEQ ID
NO:7). Comparison with other MADS-box genes shows that this region is the most conserved. A
second conserved domain, the K box, is located between residues 91 and 156 in OsMADS6 and
between residues 95 and 160 in both OsMADS7 and OsMADS8 (FIGS. 1-3). These genes
contain two variable regions, the I-region between the MADS and K boxes, and the C-region
downstream of the K box (Purugganan et al., Genetics 140:345-356, 1995). The structure of the
proteins encoded by OsMADS6, OsMADS7, and OsMADS8 is therefore typical of the plant
MADS-box gene family.
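
Residue ranges such as those given above are 1-based and inclusive, whereas Python slices are
0-based and end-exclusive. The helper below sketches the conversion. The function name region is
hypothetical, the coordinates are those stated above for the MADS box and for the K box of
OsMADS6, and the placeholder string is a dummy of arbitrary length, not a real protein sequence.

```python
def region(protein: str, first: int, last: int) -> str:
    """Return residues first..last (1-based, inclusive) of a protein sequence."""
    if not 1 <= first <= last <= len(protein):
        raise ValueError("coordinates fall outside the sequence")
    return protein[first - 1:last]


# Dummy 250-residue placeholder; substitute a real one-letter sequence in practice.
placeholder = "M" + "A" * 249

mads_box = region(placeholder, 2, 57)   # MADS-box domain, residues 2-57
k_box = region(placeholder, 91, 156)    # K box of OsMADS6, residues 91-156
print(len(mads_box), len(k_box))
```

Checking the lengths (56 and 66 residues here) is a quick guard against off-by-one errors when
coordinates are copied from a figure.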

Based on the amino-acid sequence similarity of the entire coding region, the OsMADS6,
OsMADS7, and OsMADS8 proteins can be grouped into the AGL2 family, which includes AGL2,
AGL4 and AGL6 of Arabidopsis (Ma et al., Genes Dev. 5:484-495, 1991), ZAG3 and ZAG5 of
maize (Mena et al., Plant J. 8:845-854, 1995), FBP2 of petunia (Angenent et al., Plant J. 5:33-44,
1994), TM5 of tomato (Pnueli et al., Plant Cell 6:175-186, 1994), OM1 of orchid (Lu et al., Plant
Mol. Biol. 23:901-904, 1993), and OsMADS1 of rice. Among these genes, the OsMADS6 protein
was most homologous to ZAG3 (84% homology) and ZAG5 (82% homology), while the
OsMADS7 and OsMADS8 proteins were most homologous to OM1 (61% and 65%, respectively)
and FBP2 (60% and 64% homology, respectively). OsMADS6, OsMADS7, and OsMADS8
proteins had 50% amino acid sequence homology to OsMADS1.

Alignment of the OsMADS6, OsMADS7, and OsMADS8 proteins with other members of
the AGL2 family showed that the MADS-box (FIG. 4A; SEQ ID NOS:1, 3, 5, 7 and 8-15), K-box
(FIG. 4B; SEQ ID NOS:1, 3, 5, 7 and 20-27), and C-terminal end regions (FIG. 4C; SEQ ID
NOS:1, 3, 5, 7 and 32-39) share significant sequence homologies. The MADS-box region of
OsMADS6 is 100% identical to that of ZAG3 and differs from the MADS-box region of
OsMADS7 and OsMADS8 in two positions; the 22nd and 50th amino acid serines in OsMADS6
are replaced with alanine and asparagine, respectively, in both OsMADS7 and OsMADS8. The
MADS-box sequences of OsMADS6, OsMADS7, and OsMADS8 share at least 89% identity to the
MADS-box sequences of other AGL2 proteins. The sequence homology in the K-box region is
lower compared to the MADS-box region, but still significant. These regions of OsMADS6,
OsMADS7, and OsMADS8 are at least 43% identical to other members of the family, whereas the
homology was much lower with distantly related MADS-box proteins such as AG, AP3, and PI.
The sequence homology at the C-terminal end was much lower. However, there are two blocks of
conserved regions at the end of the proteins, and these AGL2-specific sequences were not found in
other MADS-box proteins. In general, MADS-box proteins include (beginning from the
amino-terminus): a MADS-box region, an I region, a K-box region, a C-terminal region, and a
C-terminal end region.

RNA blot analysis. There are a large number of MADS-box genes in the rice genome.
Genomic DNA blot analyses were conducted to identify a region that would not cross-hybridize
with other MADS-box genes. Rice genomic DNA was digested with EcoRI, HindIII, or PstI,
fractionated on a 0.8% agarose gel, and hybridized with the gene-specific probes located at the 3'
region of each cDNA. The 300 bp PstI-EcoRI fragment, which is located at the C-terminal region
of OsMADS6, hybridized to single DNA fragments. Likewise, the 280 bp PstI-EcoRI fragment of
OsMADS7 and the 220 bp NheI-EcoRI fragment of OsMADS8 were shown to be gene-specific regions.
RNA blot analyses of the OsMADS6, OsMADS7, and OsMADS8 transcripts in rice were
conducted. Ten μg of total RNA isolated from roots and leaves of two-week-old seedlings, and
paleas/lemmas, anthers, and carpels of late vacuolated-stage flowers were hybridized with the
gene-specific probes.

The OsMADS6 transcript was detectable primarily in carpels and also weakly in palea and
lemma of late vacuolated pollen-stage flowers. However, the transcript was not detectable in
anthers or vegetative organs. This expression pattern is similar to that of OsMADS1. Spatial
expression patterns of the OsMADS7 and OsMADS8 clones were different from that of OsMADS6.
Transcripts of both clones were detectable primarily in carpels and also weakly in anthers. This
expression pattern is similar to those of OsMADS3 and OsMADS4 (Chung et al., Plant Science
109:45-56, 1995; Kang et al., Plant Mol. Biol. 29:1-10, 1995).

The temporal expression pattern of OsMADS genes during flower development was also
examined. Twenty-five μg of total RNA isolated from rice flowers at three different developmental
stages was used for detection of OsMADS gene expression. The stages examined were: young
flowers (panicle size 1 to 5 cm); flowers at the early vacuolated pollen stage; and flowers at
the late vacuolated pollen stage. Ethidium bromide staining of the 25S and 17S rRNAs was used to
demonstrate equal amounts of RNA loading. During flower development the OsMADS6 and
OsMADS7 genes were strongly expressed at the young flower stage and expression gradually
decreased as the flower further developed to the mature flower stage. The expression of OsMADS8
was weak at the young flower stage and expression gradually increased as the flower developed.

Chromosomal mapping of the OsMADS genes. An F11 recombinant inbred population of
rice was used to locate the OsMADS genes on a genetic map. C-terminal DNA fragments that were
shown to be unique to each OsMADS gene were used. These experiments revealed that OsMADS6
is located on the long arm of chromosome 2, OsMADS7 on the long arm of chromosome 8, and
OsMADS8 on the long arm of chromosome 9 (FIG. 5). FIG. 5 indicates the location of two other
MADS genes, OsMADS2 and OsMADS3. OsMADS2 is a member of the GLOBOSA family and is
located on the long arm of chromosome 1 (FIG. 5). OsMADS3 is a rice homolog of Arabidopsis
AGAMOUS and is located on the short arm of chromosome 1.

Ectopic expression. The functional roles of the three rice MADS genes were studied using
tobacco plants as a heterologous expression system. The cDNA clones were placed under the
control of the CaMV 35S promoter and transcript 7 terminator using the binary vector pGA748 (An
et al., in: Plant Molecular Biology Manual, Gelvin and Schilperoort, eds., Kluwer Academic,
Dordrecht, Belgium, pp. A3/1-19, 1988). The chimeric molecules were transferred to tobacco
plants using a kanamycin-resistance marker and an Agrobacterium-mediated Ti plasmid vector
system. Ten independent T1 transgenic plants were regenerated to avoid any artifacts. Some of
the primary transgenic plants were shorter and bloomed earlier than control plants, which were
transformed with the Ti plasmid vector alone, while others showed normal growth.

RNA-blot analyses of transgenic plants expressing the OsMADS6, OsMADS7, and
OsMADS8 transcripts were performed in order to investigate the expression level of the transgenes.
Ten μg of total RNA isolated from young leaves of the transgenic tobacco plants was hybridized
with gene-specific probes. In order to minimize the variation due to the stage of development,
young leaves at anthesis of the first flower were used for RNA isolation. It was found that plants
showing the early flowering phenotype expressed higher levels of the transgene compared with
transgenic plants exhibiting a weak or no early flowering phenotype. The transgenic lines
OsMADS6-2, -4, -5, and -7 accumulated higher levels of the transgene transcript and flowered
earlier. Transgenic lines OsMADS7-5, -9, and -10, and OsMADS8-4 and -5 accumulated higher
levels of the transgene transcript and flowered earlier.

Transgenic lines (T2 generation) that expressed OsMADS6, OsMADS7, and OsMADS8 and
displayed the most severe phenotypes were selected to examine the inheritance of the
characteristics. The results showed that the early flowering phenotype was co-inherited with the
kanamycin resistance gene to the next generation. The transgenic plant line OsMADS6-7 flowered
an average of 10 days earlier than control plants and was 30 cm shorter than controls. Similarly,
both OsMADS7-10 and OsMADS8-5 flowered an average of nine days earlier than control plants
and were significantly shorter than wild-type control plants.



Discussion 

The three additional rice MADS-box genes that were isolated are probably involved in
controlling the timing of flowering. The deduced amino acid sequences of the gene products
showed a high homology to the AGL2 family proteins. The homology was extensive, covering the
entire protein. It was observed that the AGL2 family of proteins could be further divided into
several subgroups depending on the protein sequence similarity in the K box and the two variable
regions (Theißen and Saedler, Curr. Opinion in Genet. and Dev. 5:628-639, 1995). Our results
(see FIG. 4A-C) show that OsMADS6 belongs to the AGL6 subfamily and OsMADS7 and
OsMADS8 both belong to the FBP2 subfamily.

The sequence identity of these genes suggests that they share similar biological function.
Using the co-suppression approach (Angenent et al., Plant J. 5:33-44, 1994), it was found that
suppression of FBP2 expression in petunia flowers resulted in aberrant flowers with modified whorl
two, three, and four organs. The flower possessed a green corolla, petaloid stamens, and
dramatically altered carpel structure. Therefore, FBP2 is apparently involved in the determination
of the central parts of the generative meristem. Using an antisense RNA approach, TM5 has been
observed to have similar effects on the development of the three whorls. It is known that
transgenic plants overexpressing OsMADS1 exhibit early flowering and dwarf phenotypes,
indicating that OsMADS1 is involved in controlling the timing of flowering without altering the
morphology of the floral organs. These observations suggest that the FBP2 and TM5 genes
function differently than the OsMADS1 gene. Interestingly, the length of the OsMADS6,
OsMADS7, and OsMADS8 proteins is similar to the OsMADS1 and AP1 proteins, but much
longer than the FBP2 and TM5 proteins. Therefore, it is possible that the additional amino acid
sequences encoded by the OsMADS6-8 genes are responsible for controlling the timing of
flowering.

RNA blot analyses showed that the OsMADS6, OsMADS7, and OsMADS8 genes were
expressed specifically in flowers, which coincides with the expression of genes of the AGL2
family. This indicates that the genes of the AGL2 family function primarily during flower
development. The expression of the OsMADS genes started at the early stage of flower
development and extended into the later stages of flower development, indicating that the
OsMADS6-8 genes play critical roles during the early stages and continue to function as the flower
further develops. Such expression patterns were also observed from other AGL2 members,
including AGL2, AGL4, FBP2, TM5 (Angenent et al., Plant Cell 4:983-993, 1992; Ma et al.,
Genes Dev. 5:484-495, 1991; Pnueli et al., Plant J. 1:255-266, 1991), and OsMADS1. However,
not all members of the AGL2 family are expressed at early stages of development. The OM1
transcript is detectable only after flower organs have fully developed (Lu et al., Plant Mol. Biol.
23:901-904, 1993). In mature flowers, the OsMADS6 gene was preferentially expressed in the
carpels and palea/lemma. Similar expression patterns have been found in OsMADS1, AP1, and
SQUA, suggesting a possibility that they belong to a functionally similar group. The FBP2 and
TM5 genes are expressed in whorls 2, 3 and 4 (Pnueli et al., Plant Cell 6:175-186, 1994;
Angenent et al., Plant Cell 4:983-993, 1992). Unlike most dicots, rice flowers contain a single
perianth, the palea/lemma, which more closely resembles a sepal than a petal. The palea/lemma
contains chlorophyll and remains attached to mature seeds. Therefore, expression of FBP2
homologs in dicots is expected to be restricted to sepals and petals. The OsMADS7 and OsMADS8
genes were expressed in the inner two whorls, coinciding with the expected expression pattern.

OsMADS6, OsMADS7, and OsMADS8 mapped to rice chromosomes 2, 8, and 9,
respectively. The EF-1 gene, which controls the timing of flowering in rice, is located on
chromosome 10, and the Se genes, which determine photoperiod sensitivity, are located on
chromosomes 6 or 7 (Khush and Kinoshita, in Rice Biotechnology, Khush and Toenniessen, eds.,
C.A.B. International and International Rice Research Institute, pp. 93-106, 1991). Therefore, it is
evident that none of the early flowering MADS-box genes are linked to previously mapped markers
that are involved in controlling the timing of flowering. The relationship of OsMADS6, OsMADS7,
and OsMADS8 to other genes involved in the timing of flowering, such as E-1, E-2, E-3, lf-1, and
lf-2, can be resolved when these genes are mapped.

To elucidate the functions of the rice MADS-box genes, we have generated transgenic
tobacco plants that express a chimeric fusion between the CaMV 35S promoter and an OsMADS
cDNA. The OsMADS6, OsMADS7, and OsMADS8 genes caused early flowering and dwarf phenotypes
when strongly expressed in transgenic plants.

This invention has been detailed both by example and by direct description. It should be 
apparent that one having ordinary skill in the relevant art would be able to surmise equivalents to 
the invention as described in the claims which follow but which would be within the spirit of the 
foregoing description. Those equivalents are to be included within the scope of this invention.

(2) INFORMATION FOR SEQ ID NO:1:
    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 257 amino acid residues
        (B) TYPE: amino acid
        (D) TOPOLOGY: linear
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

Met Gly Arg Gly Lys Val Glu Leu Lys Arg Ile Glu Asn Lys Ile Ser
1               5                   10                  15

Arg Gln Val Thr Phe Ala Lys Arg Arg Asn Gly Leu Leu Lys Lys Ala
            20                  25                  30

Tyr Glu Leu Ser Leu Leu Cys Asp Ala Glu Val Ala Leu Ile Ile Phe
        35                  40                  45

Ser Gly Arg Gly Arg Leu Phe Glu Phe Ser Ser Ser Ser Cys Met Tyr
    50                  55                  60

Lys Thr Leu Glu Arg Tyr Arg Ser Cys Asn Tyr Asn Ser Gln Asp Ala
65                  70                  75                  80

Ala Ala Pro Glu Asn Glu Ile Asn Tyr Gln Glu Tyr Leu Lys Leu Lys
                85                  90                  95

Thr Arg Val Glu Phe Leu Gln Thr Thr Gln Arg Asn Ile Leu Gly Glu
            100                 105                 110

Leu Ser Met Asp Leu Gly Pro Lys Glu Leu Glu Gln Leu Glu Asn Gln
        115                 120                 125

Ile Glu Val Ser Leu Lys Gln Ile Arg Ser Arg Lys Asn Gln Ala Leu
    130                 135                 140

Leu Asp Gln Leu Phe Asp Leu Lys Ser Lys Glu Gln Gln Leu Gln Asp
145                 150                 155                 160

Leu Asn Lys Asp Leu Arg Lys Lys Leu Gln Glu Thr Ser Ala Glu Asn
                165                 170                 175

Val Leu His Met Ser Trp Gln Asp Gly Gly Gly His Ser Gly Ser Ser
            180                 185                 190

Thr Val Leu Ala Asp Gln Pro His His His Gln Gly Leu Leu His Pro
        195                 200                 205

His Pro Asp Gln Gly Asp His Ser Leu Gln Ile Gly Tyr His His Pro
    210                 215                 220

His Ala His His His Gln Ala Tyr Met Asp His Leu Ser Asn Glu Ala
225                 230                 235                 240

Ala Asp Met Val Ala His His Pro Asn Glu His Ile Pro Ser Gly Trp
                245                 250                 255

Ile



(3) INFORMATION FOR SEQ ID NO: 2: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1060 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double stranded

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

TACCCGCGGG AATCGTTCGA TCGATCGGGC GAG 33

ATG GGG AGG GGA AGA GTT GAG CTG AAG CGC ATC GAG AAC AAG ATC AAC 81
Met Gly Arg Gly Arg Val Glu Leu Lys Arg Ile Glu Asn Lys Ile Asn
5 10 15

AGG CAG GTC ACC TTC TCC AAG CGC CGC AAC GGC CTC CTC AAG AAG GCC 129
Arg Gln Val Thr Phe Ser Lys Arg Arg Asn Gly Leu Leu Lys Lys Ala
20 25 30

TAC GAG CTG TCC GTT CTC TGC GAC GCC GAG GTC GCG CTC ATC ATC TTC 177
Tyr Glu Leu Ser Val Leu Cys Asp Ala Glu Val Ala Leu Ile Ile Phe
35 40 45

TCC AGC CGC GGC AAG CTC TAC GAG TTC GGC AGC GCC GGC ATA ACA AAG 225
Ser Lys Ser Arg Gly Leu Tyr Glu Phe Gly Ser Ala Gly Ile Thr Lys
50 55 60

ACT TTA GAA AGG TAC CAA CAT TGT TGC TAC AAT GCT CAA GAT TCC AAC 273
Thr Leu Glu Arg Tyr Gln His Cys Cys Tyr Asn Ala Gln Asp Ser Asn
65 70 75 80

AAT GCA CTT TCT GAA ACT CAG AGT TGG TAC CAT GAA ATG TCA AAG TTG 321
Asn Ala Leu Ser Glu Thr Gln Ser Trp Tyr His Glu Met Ser Lys Leu
85 90 95

AAA GCA AAA TTT GAA GCT TTG CAG CGC ACT CAA AGG CAC TTG CTT GGG 369
Lys Ala Lys Phe Glu Ala Leu Gln Arg Thr Gln Arg His Leu Leu Gly
100 105 110

GAG GAT CTT GGA CCA CTC AGC GTC AAA GAA TTG CAG CAG CTG GAG AAA 417
Glu Asp Leu Gly Pro Leu Ser Val Lys Glu Leu Gln Gln Leu Glu Lys
115 120 125

CAG CTT GAA TGT GCA CTA TCA CAG GCG AGA CAG AGA AAG ACG CAA CTG 465
Gln Leu Glu Cys Ala Leu Ser Gln Ala Arg Gln Arg Lys Thr Gln Leu
130 135 140

ATG ATG GAA CAG GTG GAG GAA CTT CGC AGA AAG GAG CGT CAG CTG GGT 513
Met Met Glu Gln Val Glu Glu Leu Arg Arg Lys Glu Arg Gln Leu Gly
145 150 155 160

GAA ATT AAT AGG CAA CTC AAG CAC AAG CTC GAG GTT GAA GGT TCC ACC 561
Glu Ile Asn Arg Gln Leu Lys His Lys Leu Glu Val Glu Gly Ser Thr
165 170 175

AGC AAC TAC AGA GCC ATG CAG CAA GCC TCC TGG GCT CAG GGC GCC GTG 609
Ser Asn Tyr Arg Ala Met Gln Gln Ala Ser Trp Ala Gln Gly Ala Val
180 185 190

GTG GAG AAT GGC GCC GCA TAC GTG CAG CCG CCG CCA CAC TCC GCG GCC 657
Val Glu Asn Gly Ala Ala Tyr Val Gln Pro Pro Pro His Ser Ala Ala
195 200 205

ATG GAC TCT GAA CCC ACC TTG CAA ATT GGG TAT CCT CAT CAA TTT GTG 705
Met Asp Ser Glu Pro Thr Leu Gln Ile Gly Tyr Pro His Gln Phe Val
210 215 220

CCT GCT GAA GCA AAC ACT ATT CAG AGG AGC ACT GCC CCT GCA GGT GCA 753
Pro Ala Glu Ala Asn Thr Ile Gln Arg Ser Thr Ala Pro Ala Gly Ala
225 230 235 240

GAG AAC AAC TTC ATG CTG GGA TGG GTT CTT TGA 786
Glu Asn Asn Phe Met Leu Gly Trp Val Leu
245 250



GCTAAGCAGC CATCGATCAG CTGTCAGAAG TTGGAGCTAA TAATAAAAGG GATGTGGAGT 846 
GGGCTACATG TATCTCGGAT CTCTCTGCGA GCCACCTAAT GGTCTTGCGT GGCCCTTTAA 906
TCTGTATGTT TTTGTGTGTA AGCTACTGCT AGCTGTTTGC ACCTTCTGCG TCCGTGGTTG 1026 
TGTTTCCGTG CTACCTTTTT ATGTTTTGAT TTGGATCTTG TTTGAAAATA ATCTTACCAG 1043 
CTTTGGGTAA ACTGTTT 1060 
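
As a reading aid for the codon-by-codon layout of SEQ ID NO: 2, the pairing of each nucleotide triplet with its three-letter amino acid code can be checked against the standard genetic code. The following Python sketch is illustrative only; the names `translate`, `FIRST_ROW`, and `LAST_ROW` are hypothetical and chosen for this example, and the two codon rows are copied from the first and last coding rows of the listing above.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (third base varies fastest). "*" marks a stop codon.
BASES = "TCAG"
ONE_LETTER = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), ONE_LETTER)
}

# One-letter to three-letter amino acid codes, as used in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(cds):
    """Translate a coding sequence (spaces allowed) up to the first stop codon."""
    seq = "".join(cds.split()).upper()
    residues = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":  # stop codon: end of the open reading frame
            break
        residues.append(THREE_LETTER[aa])
    return residues


# First and last coding rows of SEQ ID NO: 2, copied from the listing.
FIRST_ROW = "ATG GGG AGG GGA AGA GTT GAG CTG AAG CGC ATC GAG AAC AAG ATC AAC"
LAST_ROW = "GAG AAC AAC TTC ATG CTG GGA TGG GTT CTT TGA"

if __name__ == "__main__":
    print(" ".join(translate(FIRST_ROW)))
    print(" ".join(translate(LAST_ROW)))
```

Run on these two rows, the sketch reproduces the residues printed beneath them in the listing, with translation of the last row ending at the TGA stop codon.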
(4) INFORMATION FOR SEQ ID NO: 3:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 250 amino acid residues
(B) TYPE: amino acid

(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:



Met Gly Arg Gly Arg Val Glu Leu Lys Arg Ile Glu Asn Lys Ile Asn
5 10 15

Arg Gln Val Thr Phe Ser Lys Arg Arg Asn Gly Leu Leu Lys Lys Ala
20 25 30

Tyr Glu Leu Ser Val Leu Cys Asp Ala Glu Val Ala Leu Ile Ile Phe
35 40 45

Ser Lys Ser Arg Gly Leu Tyr Glu Phe Gly Ser Ala Gly Ile Thr Lys
50 55 60

Thr Leu Glu Arg Tyr Gln His Cys Cys Tyr Asn Ala Gln Asp Ser Asn
65 70 75 80

Asn Ala Leu Ser Glu Thr Gln Ser Trp Tyr His Glu Met Ser Lys Leu
85 90 95

Lys Ala Lys Phe Glu Ala Leu Gln Arg Thr Gln Arg His Leu Leu Gly
100 105 110

Glu Asp Leu Gly Pro Leu Ser Val Lys Glu Leu Gln Gln Leu Glu Lys
115 120 125

Gln Leu Glu Cys Ala Leu Ser Gln Ala Arg Gln Arg Lys Thr Gln Leu
130 135 140

Met Met Glu Gln Val Glu Glu Leu Arg Arg Lys Glu Arg Gln Leu Gly
145 150 155 160

Glu Ile Asn Arg Gln Leu Lys His Lys Leu Glu Val Glu Gly Ser Thr
165 170 175

Ser Asn Tyr Arg Ala Met Gln Gln Ala Ser Trp Ala Gln Gly Ala Val
180 185 190

Val Glu Asn Gly Ala Ala Tyr Val Gln Pro Pro Pro His Ser Ala Ala
195 200 205

Met Asp Ser Glu Pro Thr Leu Gln Ile Gly Tyr Pro His Gln Phe Val
210 215 220

Pro Ala Glu Ala Asn Thr Ile Gln Arg Ser Thr Ala Pro Ala Gly Ala
225 230 235 240

Glu Asn Asn Phe Met Leu Gly Trp Val Leu
245 250

(5) INFORMATION FOR SEQ ID NO: 4:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1059 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double stranded

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

TATCCCCTTC CTCCAGGTGG CTTGTTTCTT GCAGTGGTGG TGGTGGTGGT GGTGAGATCT 60
AGCTTGGTTG GTTGGTGGCA GCTGGAGATC GATCGGG 97

ATG GGG AGG GGG CGG GTG GAG CTG AAG AGG ATC GAG AAC AAG ATC AAC 145
Met Gly Arg Gly Arg Val Glu Leu Lys Arg Ile Glu Asn Lys Ile Asn
5 10 15

CGG AAG GTG ACG TTC GCC AAG AGG AGG AAT GGC CTG CTC AAG AAG GCG 193
Arg Lys Val Thr Phe Ala Lys Arg Arg Asn Gly Leu Leu Lys Lys Ala
20 25 30

TAC GAG CTC TCC GTC CTC TGC GAC GCC GAG GTC GCC CTC ATC ATC TTC 241
Tyr Glu Leu Ser Val Leu Cys Asp Ala Glu Val Ala Leu Ile Ile Phe
35 40 45

TCC AAC CGC GGC AAG CTC TAC GAG TTC TGC AGC ACC CAG AGC ATG ACT 289
Ser Asn Arg Gly Lys Leu Tyr Glu Phe Cys Ser Thr Gln Ser Met Thr
50 55 60

AAA ACG CTT GAG AAG TAT CAG AAA TGC AGT TAC GCA GGA CCC GAA ACA 337
Lys Thr Leu Glu Lys Tyr Gln Lys Cys Ser Tyr Ala Gly Pro Glu Thr
65 70 75 80

GCT GTC CAA AAT AGA GAA AGT GAG CAA TTG AAA GCT AGC CGC AAT GAA 385
Ala Val Gln Asn Arg Glu Ser Glu Gln Leu Lys Ala Ser Arg Asn Glu
85 90 95

TAC CTC AAA CTG AAG GCA AGG GTT GAA AAT TTA CAA CGG ACT CAA AGA 433
Tyr Leu Lys Leu Lys Ala Arg Val Glu Asn Leu Gln Arg Thr Gln Arg
100 105 110

AAT TTG CTG GGT CCA GAT CTT GAT TCA TTA GGC ATA AAA GAG CTC GAG 481
Asn Leu Leu Gly Pro Asp Leu Asp Ser Leu Gly Ile Lys Glu Leu Glu
115 120 125

AGC CTA GAG AAG CAG CTT GAT TCA TCC CTG AAG CAC GTC AGA ACT ACA 529
Ser Leu Glu Lys Gln Leu Asp Ser Ser Leu Lys His Val Arg Thr Thr
130 135 140

AGG ACA AAA CAT CTG GTC GAC CAA CTG ACG GAG CTT CAG AGA AAG GAA 577
Arg Thr Lys His Leu Val Asp Gln Leu Thr Glu Leu Gln Arg Lys Glu
145 150 155 160

CAA ATG GTT TCT GAA GCA AAT AGA TGC CTT AGG AGA AAA CTG GAG GAA 625
Gln Met Val Ser Glu Ala Asn Arg Cys Leu Arg Arg Lys Leu Glu Glu
165 170 175

AGC AAC CAT GTT CGC GGG CAG CAA GTG TGG GAG CAG GGC TGC AAC TTA 673
Ser Asn His Val Arg Gly Gln Gln Val Trp Glu Gln Gly Cys Asn Leu
180 185 190

ATT GGC TAT GAA CGT CAG CCT GAA GTG CAG CAG CCT CTT CAC GGC GGC 721
Ile Gly Tyr Glu Arg Gln Pro Glu Val Gln Gln Pro Leu His Gly Gly
195 200 205

AAT GGG TTC TTC CAT CCA CTT GAT GCT GCT GGT GAA CCC ACC CTT CAG 769
Asn Gly Phe Phe His Pro Leu Asp Ala Ala Gly Glu Pro Thr Leu Gln
210 215 220

ATT GGG TAC CCT GCA GAG CAT CAT GAG GCG ATG AAC AGT GCG TGC ATG 817
Ile Gly Tyr Pro Ala Glu His His Glu Ala Met Asn Ser Ala Cys Met
225 230 235 240

AAC ACC TAC ATG CCC CCA TGG CTA CCA TGA 847
Asn Thr Tyr Met Pro Pro Trp Leu Pro
245

TGATGACGGG ACAATGAATT ACGAAATAAC AAGGATATGT GGCATATATG TGGTGCCGCA 907
TACATGCATG TATCATGGCT AGCTACTTAA TTGGAGTGAT GGATTTGAAC TAGTTTCGTA 967
TGTAGCCTGT TTGTGTGTAA CTTGTGTGAG ATACTACCTT AAAAACTATC GGTGTCTGTT 1027
GAACATATTC TGCGATCAAC TTTAAGCGTA TT 1059

(6) INFORMATION FOR SEQ ID NO: 5: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 249 amino acid residues
(B) TYPE: amino acid

(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:



Met Gly Arg Gly Arg Val Glu Leu Lys Arg Ile Glu Asn Lys Ile Asn
5 10 15

Arg Lys Val Thr Phe Ala Lys Arg Arg Asn Gly Leu Leu Lys Lys Ala
20 25 30

Tyr Glu Leu Ser Val Leu Cys Asp Ala Glu Val Ala Leu Ile Ile Phe
35 40 45

Ser Asn Arg Gly Lys Leu Tyr Glu Phe Cys Ser Thr Gln Ser Met Thr
50 55 60

Lys Thr Leu Glu Lys Tyr Gln Lys Cys Ser Tyr Ala Gly Pro Glu Thr
65 70 75 80

Ala Val Gln Asn Arg Glu Ser Glu Gln Leu Lys Ala Ser Arg Asn Glu
85 90 95

Tyr Leu Lys Leu Lys Ala Arg Val Glu Asn Leu Gln Arg Thr Gln Arg
100 105 110

Asn Leu Leu Gly Pro Asp Leu Asp Ser Leu Gly Ile Lys Glu Leu Glu
115 120 125

Ser Leu Glu Lys Gln Leu Asp Ser Ser Leu Lys His Val Arg Thr Thr
130 135 140

Arg Thr Lys His Leu Val Asp Gln Leu Thr Glu Leu Gln Arg Lys Glu
145 150 155 160

Gln Met Val Ser Glu Ala Asn Arg Cys Leu Arg Arg Lys Leu Glu Glu
165 170 175

Ser Asn His Val Arg Gly Gln Gln Val Trp Glu Gln Gly Cys Asn Leu
180 185 190

Ile Gly Tyr Glu Arg Gln Pro Glu Val Gln Gln Pro Leu His Gly Gly
195 200 205

Asn Gly Phe Phe His Pro Leu Asp Ala Ala Gly Glu Pro Thr Leu Gln
210 215 220

Ile Gly Tyr Pro Ala Glu His His Glu Ala Met Asn Ser Ala Cys Met
225 230 235 240

Asn Thr Tyr Met Pro Pro Trp Leu Pro
245

(7) INFORMATION FOR SEQ ID NO: 6: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1180 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double stranded

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

TGCTTTCCCC TCTCTTCCGC TTCGCGAGAT TGGTTGATTC ATCTCGCGAT TGATCGAGCT 60
CGAGCGGCGG TGAGGTGAGG TGGAGGAGGA GGAGGAGGAG GAGATCGGG 109

ATG GGG AGA GGG AGG GTG GAG CTG AAG AGG ATC GAG AAC AAG ATC AAC 157
Met Gly Arg Gly Arg Val Glu Leu Lys Arg Ile Glu Asn Lys Ile Asn
5 10 15

AGG CAG GTG ACG TTC GCG AAG CGG AGG AAT GGG CTG CTC AAG AAG GCG 205
Arg Gln Val Thr Phe Ala Lys Arg Arg Asn Gly Leu Leu Lys Lys Ala
20 25 30

TAC GAG CTC TCC GTG CTC TGC GAC GCC GAG GTC GCC CTC ATC ATC TTC 253
Tyr Glu Leu Ser Val Leu Cys Asp Ala Glu Val Ala Leu Ile Ile Phe
35 40 45

TCC AAC CGC GGC AAG CTC TAC GAG TTC TGC AGC GGC CAA AGC ATG ACC 301
Ser Asn Arg Gly Lys Leu Tyr Glu Phe Cys Ser Gly Gln Ser Met Thr
50 55 60

AGA ACT TTG GAA AGA TAC CAA AAA TTC AGT TAT GGT GGG CCA GAT ACT 349
Arg Thr Leu Glu Arg Tyr Gln Lys Phe Ser Tyr Gly Gly Pro Asp Thr
65 70 75 80

GCA ATA CAG AAC AAG GAA AAT GAG TTA GTG CAA AGC AGC CGC AAT GAG 397
Ala Ile Gln Asn Lys Glu Asn Glu Leu Val Gln Ser Ser Arg Asn Glu
85 90 95

TAC CTC AAA CTG AAG GCA CGG GTG GAA AAT TTA CAG AGG ACC CAA AGG 445
Tyr Leu Lys Leu Lys Ala Arg Val Glu Asn Leu Gln Arg Thr Gln Arg
100 105 110

AAT err CTT GGT GAA GAT CTT GGG ACA CTT GGC ATA AAA GAG CTA GAG 493
Asn Leu Leu Gly Glu Asp Leu Gly Thr Leu Gly Ile Lys Glu Leu Glu
115 120 125

CAG CTT GAG AAA CAA CTT GAT TCA TCC TTG AGG CAC ATT AGA TCC ACA 541
Gln Leu Glu Lys Gln Leu Asp Ser Ser Leu Arg His Ile Arg Ser Thr
130 135 140

AGG ACA CAG CAT ATG GAT CAG CTC ACT GAT CTC CAG AGG AGG GAA 589
Arg Thr Gln His Met Leu Asp Gln Leu Thr Asp Leu Gln Arg Arg Glu
145 150 155 160

CAA ATG TTG TGT GAA GCA AAT AAG TGC CTC AGA AGA AAA CTG GAG GAG 637
Gln Met Leu Cys Glu Ala Asn Lys Cys Leu Arg Arg Lys Leu Glu Glu
165 170 175

AGC AAC CAG TTG CAT GGA CAA GTG TGG GAG CAC GGC GCC ACC CTA CTC 685
Ser Asn Gln Leu His Gly Gln Val Trp Glu His Gly Ala Thr Leu Leu
180 185 190

GGC TAC GAG CGG CAG TCG CCT CAT GCC GTC CAG CAG GTG CCA CCG CAC 733
Gly Tyr Glu Arg Gln Ser Pro His Ala Val Gln Gln Val Pro Pro His
195 200 205

GGT GGC AAC GGA TTC TTC CAT TCC CTG GAA GCT GCC GCC GAG CCC ACC 781
Gly Gly Asn Gly Phe Phe His Ser Leu Glu Ala Ala Ala Glu Pro Thr
210 215 220

TTG CAG ATC GGG TTT ACT CCA GAG CAG ATG AAC AAC TCA TGC GTG ACT 829
Leu Gln Ile Gly Phe Thr Pro Glu Gln Met Asn Asn Ser Cys Val Thr
225 230 235 240

GCC TTC ATG CCG ACA TGG CTA CCC TGA 856
Ala Phe Met Pro Thr Trp Leu Pro
245

ACTCCTGAAG GCCGATGCGA CAACCAATAA AAACGGATGT GACGACACAG ATCAAGTCGC 916
ACCATTAGAT TGATCTTCTC CTACAAGAGT GAGACTAGTA ATTCCGCGTT TGTGTGCTAG 976
CGTGTTGAAA CTTTTCTGAT GTGATGCACG CACTTTTAAT TATTATTAAG CGTTCAAGGA 1036
CTAGTATGTG GTATAAAAGC CCGTACGTGA CAGCCTATGG TTATATGCTG CGCAAAAACT 1096
ACGTATGGTA CAGTGCAGTG CCTGTACATT TCATAATTTG CGGGTAAAGT TTATTGACTA 1156
TATATCCAGT GTGTCAAATA TAAT 1180

(8) INFORMATION FOR SEQ ID NO: 7:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 248 amino acid residues

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:



Met Gly Arg Gly Arg Val Glu Leu Lys Arg Ile Glu Asn Lys Ile Asn
5 10 15

Arg Gln Val Thr Phe Ala Lys Arg Arg Asn Gly Leu Leu Lys Lys Ala
20 25 30

Tyr Glu Leu Ser Val Leu Cys Asp Ala Glu Val Ala Leu Ile Ile Phe
35 40 45

Ser Asn Arg Gly Lys Leu Tyr Glu Phe Cys Ser Gly Gln Ser Met Thr
50 55 60

Arg Thr Leu Glu Arg Tyr Gln Lys Phe Ser Tyr Gly Gly Pro Asp Thr
65 70 75 80

Ala Ile Gln Asn Lys Glu Asn Glu Leu Val Gln Ser Ser Arg Asn Glu
85 90 95

Tyr Leu Lys Leu Lys Ala Arg Val Glu Asn Leu Gln Arg Thr Gln Arg
100 105 110

Asn Leu Leu Gly Glu Asp Leu Gly Thr Leu Gly Ile Lys Glu Leu Glu
115 120 125

Gln Leu Glu Lys Gln Leu Asp Ser Ser Leu Arg His Ile Arg Ser Thr
130 135 140

Arg Thr Gln His Met Leu Asp Gln Leu Thr Asp Leu Gln Arg Arg Glu
145 150 155 160

Gln Met Leu Cys Glu Ala Asn Lys Cys Leu Arg Arg Lys Leu Glu Glu
165 170 175

Ser Asn Gln Leu His Gly Gln Val Trp Glu His Gly Ala Thr Leu Leu
180 185 190

Gly Tyr Glu Arg Gln Ser Pro His Ala Val Gln Gln Val Pro Pro His
195 200 205

Gly Gly Asn Gly Phe Phe His Ser Leu Glu Ala Ala Ala Glu Pro Thr
210 215 220

Leu Gln Ile Gly Phe Thr Pro Glu Gln Met Asn Asn Ser Cys Val Thr
225 230 235 240

Ala Phe Met Pro Thr Trp Leu Pro
245
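
The stated lengths of the three deduced proteins can be cross-checked against the position numbers printed in the corresponding cDNA listings. A minimal sketch of that arithmetic follows, assuming that the number at the end of the 5' untranslated block gives the last non-coding position and that the number beside the stop codon gives the last position of the open reading frame; the function name `coded_residues` and the `LISTING` table are hypothetical and introduced only for this illustration.

```python
def coded_residues(utr5_end, stop_end):
    """Residues encoded between the 5' UTR and the end of the stop codon.

    utr5_end: position of the last nucleotide before the ATG start codon.
    stop_end: position of the last nucleotide of the stop codon.
    """
    cds_length = stop_end - utr5_end
    if cds_length % 3 != 0:
        raise ValueError("open reading frame is not a whole number of codons")
    # Every codon except the terminal stop codon encodes one residue.
    return cds_length // 3 - 1


# (5' UTR end, stop codon end, stated protein length), read from the listing.
LISTING = {
    "SEQ ID NO: 2 -> SEQ ID NO: 3": (33, 786, 250),
    "SEQ ID NO: 4 -> SEQ ID NO: 5": (97, 847, 249),
    "SEQ ID NO: 6 -> SEQ ID NO: 7": (109, 856, 248),
}

if __name__ == "__main__":
    for name, (utr5_end, stop_end, stated) in LISTING.items():
        computed = coded_residues(utr5_end, stop_end)
        print(f"{name}: computed {computed}, stated {stated}")
```

Under these assumptions the computed residue counts agree with the LENGTH fields given for SEQ ID NO: 3, SEQ ID NO: 5, and SEQ ID NO: 7.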



(9) INFORMATION FOR SEQ ID NO: 8:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 56 amino acid residues

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Gly Arg Gly Arg Val Glu Leu Lys Arg Ile Glu Asn Lys Ile Asn Arg
5 10 15

Gln Val Thr Phe Ser Lys Arg Arg Asn Gly Leu Leu Lys Lys Ala Tyr
20 25 30

Glu Leu Ser Val Leu Cys Asp Ala Glu Val Ala Leu Ile Ile Phe Ser
35 40 45

Ser Arg Gly Lys Leu Tyr Glu Phe
50 55



(10) INFORMATION FOR SEQ ID NO: 9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 56 amino acid residues

(B) TYPE: amino acid
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Gly Arg Gly Arg Val Glu Leu Lys Arg Ile Glu Asn Lys Ile Asn Arg
5 10 15

Gln Val Thr Phe Ser Lys Arg Arg Asn Gly Leu Leu Lys Lys Ala Tyr
20 25 30

Glu Leu Ser Val Leu Cys Asp Ala Glu Val Ala Leu Ile Ile Phe Ser
35 40 45

Gly Arg Gly Lys Leu Tyr Glu Phe
50 55

(11) INFORMATION FOR SEQ ID NO: 10:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 56 amino acid residues
(B) TYPE: amino acid

(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Gly Arg Gly Arg Val Glu Met Lys Arg Ile Glu Asn Lys Ile Asn Arg
5 10 15

Gln Val Thr Phe Ser Lys Arg Arg Asn Gly Leu Leu Lys Lys Ala Tyr
20 25 30

Glu Leu Ser Val Leu Cys Asp Ala Glu Val Ala Leu Ile Ile Phe Ser
35 40 45

Ser Arg Gly Lys Leu Tyr Glu Phe
50 55



(12) INFORMATION FOR SEQ ID NO: 11:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 56 amino acid residues

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Gly Arg Gly Arg Val Glu Leu Lys Arg Ile Glu Asn Lys Ile Asn Arg
5 10 15

Gln Val Thr Phe Ala Lys Arg Arg Asn Gly Leu Leu Lys Lys Ala Tyr
20 25 30

Glu Leu Ser Val Leu Cys Asp Ala Glu Val Ala Leu Ile Ile Phe Ser
35 40 45

Asn Arg Gly Lys Leu Tyr Glu Phe
50 55



(13) INFORMATION FOR SEQ ID NO: 12:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 56 amino acid residues

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Gly Arg Gly Arg Val Glu Leu Lys Arg Ile Glu Gly Lys Ile Asn Arg
5 10 15

Gln Val Thr Phe Ala Lys Arg Arg Asn Gly Leu Leu Lys Lys Ala Tyr
20 25 30

Glu Leu Ser Val Leu Cys Asp Ala Glu Val Ala Leu Ile Ile Phe Ser
35 40 45

Asn Arg Gly Lys Leu Tyr Glu Phe
50 55

(14) INFORMATION FOR SEQ ID NO: 13:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 56 amino acid residues

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

Gly Arg Gly Arg Val Glu Leu Lys Met Ile Glu Asn Lys Ile Asn Arg
5 10 15

Gln Val Thr Phe Ala Lys Arg Arg Lys Arg Leu Leu Lys Lys Ala Tyr
20 25 30

Glu Leu Ser Val Leu Cys Asp Ala Glu Val Ala Leu Ile Ile Phe Ser
35 40 45

Asn Arg Gly Lys Leu Tyr Glu Phe
50 55

(15) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 56 amino acid residues

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

Gly Arg Gly Arg Val Glu Leu Lys Arg Ile Glu Asn Lys Ile Asn Arg
5 10 15

Gln Val Thr Phe Ala Lys Arg Arg Asn Gly Leu Leu Lys Lys Ala Tyr
20 25 30

Glu Leu Ser Val Leu Cys Asp Ala Glu Val Ala Leu Ile Ile Phe Ser
35 40 45

Asn Arg Gly Lys Leu Tyr Glu Phe
50 55



(16) INFORMATION FOR SEQ ID NO: 15:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 56 amino acid residues
(B) TYPE: amino acid

(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

Gly Arg Gly Arg Val Glu Leu Lys Arg Ile Glu Asn Lys Ile Asn Arg
5 10 15

Gln Val Thr Phe Ala Lys Arg Arg Asn Gly Leu Leu Lys Lys Ala Tyr
20 25 30

Glu Leu Ser Val Leu Cys Asp Ala Glu Val Ser Leu Ile Val Phe Ser
35 40 45

Asn Arg Gly Lys Leu Tyr Glu Phe
50 55

(17) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 56 amino acid residues
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

Gly Arg Gly Arg Val Gln Leu Lys Arg Ile Glu Asn Lys Ile Asn Arg
5 10 15

Gln Val Thr Phe Ser Lys Arg Arg Ala Gly Leu Leu Lys Lys Ala His
20 25 30

Glu Ile Ser Val Leu Cys Asp Ala Glu Val Ala Leu Val Val Phe Ser
35 40 45

His Lys Gly Lys Leu Phe Glu Tyr
50 55



(18) INFORMATION FOR SEQ ID NO: 17:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 56 amino acid residues

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

Gly Arg Gly Lys Ile Glu Ile Lys Arg Ile Glu Asn Thr Thr Asn Arg
5 10 15

Gln Val Thr Phe Cys Lys Arg Arg Asn Gly Leu Leu Lys Lys Ala Tyr
20 25 30

Glu Leu Ser Val Leu Cys Asp Ala Glu Val Ala Leu Ile Val Phe Ser
35 40 45

Ser Arg Gly Arg Leu Tyr Glu Tyr
50 55

(19) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 56 amino acid residues

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

Ala Arg Gly Lys Ile Gln Ile Lys Arg Ile Glu Asn Gln Thr Asn Arg
5 10 15

Gln Val Thr Tyr Ser Lys Arg Arg Asn Gly Leu Phe Lys Lys Ala His
20 25 30

Glu Leu Thr Val Leu Cys Asp Ala Arg Val Ser Ile Ile Met Phe Ser
35 40 45

Ser Ser Asn Lys Leu His Glu Tyr
50 55

(20) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 56 amino acid residues
(B) TYPE: amino acid

(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

Gly Arg Gly Lys Ile Glu Ile Lys Arg Ile Glu Asn Ala Asn Asn Arg
5 10 15

Val Val Thr Phe Ser Lys Arg Arg Asn Gly Leu Val Lys Lys Ala Lys
20 25 30

Glu Ile Thr Val Leu Cys Asp Ala Lys Val Ala Leu Ile Ile Phe Ala
35 40 45

Ser Asn Gly Lys Met Ile Asp Tyr
50 55

(21) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 65 amino acid residues

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

Gln Glu Met Ser Lys Leu Arg Ala Lys Phe Glu Ala Leu Gln Arg Thr
5 10 15

Gln Arg His Leu Leu Gly Glu Glu Leu Gly Pro Leu Ser Val Lys Glu
20 25 30

Leu Gln Gln Leu Glu Lys Gln Leu Glu Cys Ala Leu Ser Gln Ala Arg
35 40 45

Gln Arg Lys Thr Gln Leu Met Met Glu Gln Val Glu Glu Leu Arg Arg
50 55 60

Lys
65

(22) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 65 amino acid residues

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

Gln Glu Met Ser Lys Leu Arg Ala Lys Phe Glu Ala Leu Gln Arg Thr
5 10 15

Gln Arg His Leu Leu Gly Glu Asp Leu Gly Pro Leu Ser Val Lys Glu
20 25 30

Leu Gln Gln Leu Glu Lys Gln Leu Glu Cys Ala Leu Ser Gln Ala Arg
35 40 45

Gln Arg Lys Thr Gln Val Met Met Glu Gln Val Glu Glu Leu Arg Arg
50 55 60

Thr
65



(23) INFORMATION FOR SEQ ID NO: 22:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 65 amino acid residues

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

Gln Glu Val Thr Lys Leu Lys Ser Lys Tyr Glu Ser Leu Val Arg Thr
5 10 15

Asn Arg Asn Leu Leu Gly Glu Asp Leu Gly Glu Met Gly Val Lys Glu
20 25 30

Leu Gln Ala Leu Glu Arg Gln Leu Glu Ala Ala Leu Thr Ala Thr Arg
35 40 45

Gln Arg Lys Thr Gln Val Met Met Glu Glu Met Glu Asp Leu Arg Lys
50 55 60

Lys
65



(24) INFORMATION FOR SEQ ID NO: 23:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 65 amino acid residues
(B) TYPE: amino acid

(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

Gln Glu Tyr Leu Lys Leu Lys Ala Arg Tyr Glu Ala Leu Gln Arg Ser
5 10 15

Gln Arg Asn Leu Leu Gly Glu Asp Leu Gly Pro Leu Asn Ser Lys Glu
20 25 30

Leu Glu Ser Leu Glu Arg Gln Leu Asp Met Ser Leu Lys Gln Ile Arg
35 40 45

Ser Thr Arg Thr Gln Leu Met Leu Asp Gln Leu Gln Asp Leu Gln Arg
50 55 60

Lys
65

(25) INFORMATION FOR SEQ ID NO: 24:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 65 amino acid residues
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

Gln Glu Tyr Leu Lys Leu Lys Gly Arg Tyr Glu Ala Leu Gln Arg Ser
5 10 15

Gln Arg Asn Leu Leu Gly Glu Asp Leu Gly Pro Leu Asn Ser Lys Glu
20 25 30

Leu Glu Ser Leu Glu Arg Gln Leu Asp Met Ser Leu Lys Gln Ile Arg
35 40 45

Ser Thr Arg Thr Gln Leu Met Leu Asp Gln Leu Thr Asp Tyr Gln Arg
50 55 60

Lys
65



(26) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 65 amino acid residues

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

Gln Glu Tyr Leu Lys Leu Lys Asn Arg Val Glu Ala Leu Gln Arg Ser
5 10 15

Gln Arg Asn Leu Leu Gly Glu Asp Leu Gly Pro Leu Gly Ser Lys Glu
20 25 30

Leu Glu Gln Leu Glu Arg Gln Leu Asp Ser Ser Leu Arg Gln Ile Arg
35 40 45

Ser Thr Arg Thr Gln Phe Met Leu Asp Gln Leu Ala Asp Leu Gln Arg
50 55 60

Arg
65

(27) INFORMATION FOR SEQ ID NO: 26:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 65 amino acid residues
(B) TYPE: amino acid

(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

Arg Glu Tyr Leu Lys Leu Lys Gly Arg Tyr Glu Asn Leu Gln Arg Gln
5 10 15

Gln Arg Asn Leu Leu Gly Glu Asp Leu Gly Pro Leu Asn Ser Lys Glu
20 25 30

Leu Glu Gln Leu Glu Arg Gln Leu Asp Gly Ser Leu Lys Gln Val Arg
35 40 45

Ser Ile Lys Thr Gln Tyr Met Leu Asp Gln Leu Ser Asp Leu Gln Asn
50 55 60

Lys
65



(28) INFORMATION FOR SEQ ID NO: 27:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 65 amino acid residues

(B) TYPE: amino acid
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

Arg Glu Tyr Leu Lys Leu Lys Gly Arg Tyr Glu Asn Leu Gln Arg Gln
5 10 15

Gln Arg Asn Leu Leu Gly Glu Asp Leu Gly Pro Leu Asn Ser Lys Glu
20 25 30

Leu Glu Gln Leu Glu Arg Gln Leu Asp Gly Ser Leu Lys Gln Val Arg
35 40 45

Cys Ile Lys Thr Gln Tyr Met Leu Asp Gln Leu Ser Asp Leu Gln Gly
50 55 60

Lys
65

(29) INFORMATION FOR SEQ ID NO: 28:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 65 amino acid residues

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

Met Glu Tyr Asn Arg Leu Lys Ala Lys Ile Glu Leu Leu Glu Arg Asn
5 10 15

Gln Arg His Tyr Leu Gly Glu Asp Leu Gln Ala Met Ser Pro Lys Glu
20 25 30

Leu Gln Asn Leu Glu Gln Gln Leu Asp Thr Ala Leu Lys His Ile Arg
35 40 45

Thr Arg Lys Asn Gln Leu Met Tyr Glu Ser Ile Asn Glu Leu Gln Lys
50 55 60

Lys
65

(30) INFORMATION FOR SEQ ID NO: 29:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 65 amino acid residues

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

Gln Glu Ser Ala Lys Leu Arg Gln Gln Ile Ile Ser Ile Gln Asn Ser
5 10 15

Asn Arg Gln Leu Met Gly Glu Thr Ile Gly Ser Met Ser Pro Lys Glu
20 25 30

Leu Arg Asn Leu Glu Gly Arg Leu Glu Arg Ser Ile Thr Arg Ile Arg
35 40 45

Ser Lys Lys Asn Glu Leu Leu Phe Ser Glu Ile Asp Tyr Met Gln Lys
50 55 60

Arg
65



(31) INFORMATION FOR SEQ ID NO: 30:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 66 amino acid residues

(B) TYPE: amino acid
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

Gln Glu Thr Lys Arg Lys Leu Leu Glu Thr Asn Arg Asn Leu Arg Thr
5 10 15

Gln Ile Lys Gln Arg Leu Gly Glu Cys Leu Asp Glu Leu Asp Ile Gln
20 25 30

Glu Leu Arg Arg Leu Glu Asp Glu Met Glu Asn Thr Phe Lys Leu Val
35 40 45

Arg Glu Arg Lys Phe Lys Ser Leu Gly Asn Gln Ile Glu Thr Thr Lys
50 55 60

Lys Lys
65

(32) INFORMATION FOR SEQ ID NO: 31:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 61 amino acid residues

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

Asn Glu Ile Asp Arg Ile Lys Lys Glu Asn Asp Ser Leu Gln Leu Glu
5 10 15

Leu Arg His Leu Lys Gly Glu Asp Ile Gln Ser Leu Asn Leu Lys Asn
20 25 30

Leu Met Ala Val Glu His Ala Ile Glu His Gly Leu Asp Lys Val Arg
35 40 45

Asp His Gln Met Glu Ile Leu Ile Ser Lys Arg Arg Asn
50 55 60

(33) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 39 amino acid residues

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

Glu Pro Thr Leu Gln Ile Gly Tyr Pro His His Gln Phe Pro Pro Pro
5 10 15

Glu Ala Val Asn Asn Ile Pro Arg Ser Ala Ala Thr Gly Glu Asn Asn
20 25 30

Phe Met Leu Gly Trp Val Leu
35

(34) INFORMATION FOR SEQ ID NO: 33:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 amino acid residues
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

Glu Pro Thr Leu Gln Ile Gly Tyr Pro Pro His His Gln Phe Leu Pro
5 10 15

Ser Glu Ala Ala Asn Asn Ile Pro Arg Ser Pro Pro Gly Gly Glu Asn
20 25 30

Asn Phe Met Leu Gly Trp Val Leu
35 40

(35) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 38 amino acid residues

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

Glu Pro Phe Leu Gln Ile Gly Phe Gly Gln His Tyr Tyr Val Gly Gly
5 10 15

Glu Gly Ser Ser Val Ser Lys Ser Asn Val Ala Gly Glu Thr Asn Phe
20 25 30

Val Gln Gly Trp Val Leu
35

(36) INFORMATION FOR SEQ ID NO: 35:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 amino acid residues

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

Glu Pro Thr Leu Gln Ile Gly Tyr Gln Asn Asp Pro Ile Thr Val Gly
5 10 15

Gly Ala Gly Pro Ser Val Asn Asn Tyr Met Ala Gly Trp Leu Pro
20 25 30

(37) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 amino acid residues

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:

Glu Pro Thr Leu Gln Ile Gly Tyr Gln Asn Asp Pro Ile Thr Val Gly
5 10 15

Gly Ala Gly Pro Ser Val Asn Asn Tyr Met Ala Gly Trp Leu Pro
20 25 30

(38) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 32 amino acid residues

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

Glu Pro Thr Leu Gln Ile Gly Tyr His Ser Asp Ile Thr Met Ala Thr
5 10 15

Ala Thr Ala Ser Thr Val Asn Asn Tyr Met Pro Pro Gly Trp Leu Gly
20 25 30



(39) INFORMATION FOR SEQ ID NO: 38:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 37 amino acid residues

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

Asn Pro Thr Leu Gln Met Gly Tyr Asp Asn Pro Val Cys Ser Glu Gln
5 10 15

Ile Thr Ala Thr Thr Gln Ala Gln Ala Gln Pro Gly Asn Gly Tyr Ile
20 25 30

Pro Gly Trp Met Leu
35

(40) INFORMATION FOR SEQ ID NO: 39:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 37 amino acid residues
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

Asp Pro Thr Leu Gln Ile Gly Tyr Ser His Pro Val Cys Ser Glu Gln
5 10 15

Met Ala Val Thr Val Gln Gly Gln Ser Gln Gln Gly Asn Gly Tyr Ile
20 25 30

Pro Gly Trp Met Leu
35

(41) INFORMATION FOR SEQ ID NO: 40:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 39 amino acid residues

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

Ser Pro Phe Leu Asn Met Gly Gly Leu Tyr Gln Glu Asp Asp Pro Met
5 10 15

Ala Met Arg Asn Asp Leu Glu Leu Thr Leu Glu Pro Val Tyr Asn Cys
20 25 30

Asn Leu Gly Cys Phe Ala Ala
35

(42) INFORMATION FOR SEQ ID NO: 41:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 amino acid residues

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

Val Ala Ala Leu Gln Pro Asn Asn His His Tyr Ser Ser Ala Gly Arg
5 10 15

Gln Asp Gln Thr Ala Leu Gln Leu Val
20 25

(43) INFORMATION FOR SEQ ID NO: 42:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29 amino acid residues

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

Phe His Gln Asn His His His Tyr Tyr Pro Asn His Gly Leu His Ala
5 10 15

Pro Ser Ala Ser Asp Ile Ile Thr Phe His Leu Leu Glu
20 25

(44) INFORMATION FOR SEQ ID NO: 43:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 amino acid residues

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:

Val Ala Ala Leu Gln Pro Asn Leu Gln Glu Lys Ile Met Ser Leu Val
5 10 15

Ile Asp
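
The 56-residue MADS-box sequences in the listing are equal in length, so they can be compared position by position without alignment gaps. The following Python sketch shows one simple way to count identical positions between two such sequences; it is an illustration under that assumption and not a method described in this specification. The function name `percent_identity` is hypothetical, and the two sequences are SEQ ID NO: 8 and SEQ ID NO: 9 as given in the listing.

```python
def percent_identity(seq_a, seq_b):
    """Return (identical positions, percent identity) for two equal-length sequences.

    Each sequence is a list of three-letter amino acid codes. No gaps are
    handled; sequences of unequal length are rejected.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return matches, 100.0 * matches / len(seq_a)


# SEQ ID NO: 8 and SEQ ID NO: 9, copied from the sequence listing.
SEQ_ID_NO_8 = """
Gly Arg Gly Arg Val Glu Leu Lys Arg Ile Glu Asn Lys Ile Asn Arg
Gln Val Thr Phe Ser Lys Arg Arg Asn Gly Leu Leu Lys Lys Ala Tyr
Glu Leu Ser Val Leu Cys Asp Ala Glu Val Ala Leu Ile Ile Phe Ser
Ser Arg Gly Lys Leu Tyr Glu Phe
""".split()

SEQ_ID_NO_9 = """
Gly Arg Gly Arg Val Glu Leu Lys Arg Ile Glu Asn Lys Ile Asn Arg
Gln Val Thr Phe Ser Lys Arg Arg Asn Gly Leu Leu Lys Lys Ala Tyr
Glu Leu Ser Val Leu Cys Asp Ala Glu Val Ala Leu Ile Ile Phe Ser
Gly Arg Gly Lys Leu Tyr Glu Phe
""".split()

if __name__ == "__main__":
    matches, pct = percent_identity(SEQ_ID_NO_8, SEQ_ID_NO_9)
    print(f"{matches} of {len(SEQ_ID_NO_8)} positions identical ({pct:.1f}%)")
```

The two sequences differ only at position 49 (Ser in SEQ ID NO: 8, Gly in SEQ ID NO: 9), so 55 of the 56 positions are identical.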
WHAT IS CLAIMED IS:

1. An isolated nucleic acid comprising a nucleotide sequence selected from a group
consisting of SEQ ID NO:2, SEQ ID NO:4, and SEQ ID NO:6.

2. The nucleic acid of claim 1 that, when expressed in a transgenic plant containing said
nucleic acid, causes the transgenic plant to exhibit at least one phenotype selected from the group
consisting of: (i) diminished apical dominance; (ii) early flowering; (iii) altered daylength
requirement for flowering; (iv) greater synchronization of flowering; and (v) relaxed vernalization
requirement, compared to a nontransgenic control plant.

3. The nucleic acid of claim 2 wherein expression of the nucleic acid in the transgenic 
plant causes the transgenic plant to exhibit diminished apical dominance and early flowering 
compared to the nontransgenic control plant. 



4. An isolated nucleic acid comprising at least 30 consecutive nucleotides of SEQ ID NO:2.



5. The nucleic acid of claim 4 comprising a portion of SEQ ID NO:2, wherein the
portion is selected from the group consisting of a protein-coding region, a MADS-box region,
and a K-box region. 

6. An isolated nucleic acid comprising at least 100 consecutive nucleotides having at 
least 70% nucleotide sequence similarity with a nucleotide sequence of SEQ ID NO:2, not 
including MADS-box and K-box regions thereof. 

7. The nucleic acid of claim 6, wherein expression of the nucleic acid in a transgenic
plant causes the transgenic plant to exhibit at least one phenotype in the transgenic plant selected
from the group consisting of: (i) diminished apical dominance, (ii) early flowering, (iii) altered
daylength requirement for flowering; (iv) greater synchronization of flowering; and (v) relaxed 
vernalization requirement compared to a nontransgenic control plant. 

8. The nucleic acid of claim 7 wherein expression of the nucleic acid in the transgenic 
plant causes the transgenic plant to exhibit diminished apical dominance and early flowering 
compared to the nontransgenic control plant. 
9. The nucleic acid of claim 7 comprising at least 100 consecutive nucleotides that
comprises only silent or conservative substitutions to the nucleotide sequence of SEQ ID NO: 2. 

10. A transgenic plant comprising the nucleic acid of claim 1. 



11. A transgenic plant comprising the nucleic acid of claim 4. 

12. A transgenic plant comprising the nucleic acid of claim 6. 



13. A method for producing a transgenic plant, comprising the steps of: 

(a) providing a nucleic acid as recited in claim 1; 

(b) introducing the nucleic acid of step (a) into a plant cell, thereby producing a 
transformed plant cell; and 

(c) regenerating from the transformed plant cell a transgenic plant comprising the nucleic
acid.



14. A method for producing a transgenic plant, comprising the steps of: 

(a) providing a nucleic acid as recited in claim 4; 

(b) introducing the nucleic acid of step (a) into a plant cell, thereby producing a 
transformed plant cell; and 

(c) regenerating from the transformed plant cell a transgenic plant comprising the nucleic
acid.

15. A method for producing a transgenic plant, comprising the steps of: 

(a) providing a nucleic acid as recited in claim 6; 

(b) introducing the nucleic acid of step (a) into a plant cell, thereby producing a 
transformed plant cell; and 

(c) regenerating from the transformed plant cell a transgenic plant comprising the nucleic
acid.
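
By way of illustration only, the sequence criteria recited in claims 4 and 6 (a run of at least 30 consecutive nucleotides of a reference sequence, and a stretch of at least 100 consecutive nucleotides having at least 70% similarity) may be checked computationally. The following Python sketch rests on stated assumptions: "similarity" is taken to mean ungapped percent identity over a fixed-length window, which is one possible reading and is not a definition given in the claims; the function names and the toy sequences are hypothetical and are not SEQ ID NO:2; and any regions to be excluded, such as the MADS-box and K-box regions of claim 6, are assumed to have been removed by the caller beforehand.

```python
def longest_shared_run(query: str, reference: str) -> int:
    """Length of the longest run of consecutive nucleotides present in both sequences."""
    q, r = query.upper(), reference.upper()
    best = 0
    prev = [0] * (len(r) + 1)  # run lengths ending at the previous query position
    for i in range(1, len(q) + 1):
        cur = [0] * (len(r) + 1)
        for j in range(1, len(r) + 1):
            if q[i - 1] == r[j - 1]:
                cur[j] = prev[j - 1] + 1  # extend the run along the diagonal
                best = max(best, cur[j])
        prev = cur
    return best


def best_window_identity(query: str, reference: str, window: int = 100) -> float:
    """Highest ungapped percent identity between any `window`-long stretch of
    `query` and any `window`-long stretch of `reference` (assumed metric)."""
    q, r = query.upper(), reference.upper()
    if len(q) < window or len(r) < window:
        return 0.0
    best_matches = 0
    for i in range(len(q) - window + 1):
        q_win = q[i:i + window]
        for j in range(len(r) - window + 1):
            matches = sum(a == b for a, b in zip(q_win, r[j:j + window]))
            best_matches = max(best_matches, matches)
    return 100.0 * best_matches / window


if __name__ == "__main__":
    # Toy sequences for demonstration only; they are not taken from the sequence listing.
    toy_reference = "GGCGTACGGGATTACAGATTACA"
    toy_query = "AAACGTACGTTTATTACAGA"
    print("longest shared run:", longest_shared_run(toy_query, toy_reference))
    print("best 10-nt window identity (%):",
          best_window_identity(toy_query, toy_reference, window=10))
```

The exhaustive window comparison is quadratic in sequence length, which is adequate for sequences of roughly one kilobase such as those of FIGS. 1-3; an alignment tool would be the usual choice for larger comparisons or where gaps must be considered.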
FIG. 1

1 TACCCGCGGGAATCGTTCGATCGATCGGGCGAGATG GGGAGGGGAAGAGTTGAGCTOAAfl 

MGRGRVELK 



61 CGCATCGAGMCAAGATCAArA^ CAGGTCArPTTrTCCAAGCGTCfy'AACGGCCTrrTr

R I E N K I N R Q V T P S K R R N G L L 29 

121 AimGGCCTAC GAGCTGTCCGTTrTC^ 

KKAYELSVLCDAEVALIIFS 49 
181 AGCCGCGGCAAGCTCTACGAGTTranPAr,rr,rrnr.raTa ar a a a^HTOCAMCCTAC 

KSRGLYEFGSAGITKTLERY 69 

241 CMCATTGTTGCTACMTGCTCMGATTCCMCMTGCACTTTCTGAAACTCAGAGTTGG 

QHCCYNAQDSNNALS ETQSW 89

301 TACCATGAMTGTCAAAGTTflAA AGCAAMTTTffAAGCTTTGrAnrG CACTCAAAryy'Ar 

Y H E M S K L K A K F E A L Q R T Q R H 109 

361 ^GCTTG^GAGX7ATCTTGGACCACTrAGrGTrAAAaAAT^r ; cAGrAGrTf?flAfiAAAr i Ar: K-box
LLGEDLGPLSVKELQQLEKQ 129 

421 CTTGMTGTGCACTATCACAGGCGAGACAGAGAAAnArflrA A CTGATGAT(^AArA(y;'pr; 

LECALS QARQRKTQLMMEQV 149 
481 GAGGMCTTCGCAGAAAGGAftrCTPAnrT^^^ . 

EELRRKE RQLGEINRQLKHK 169 
541 CTCGAGGTTGAAGGTTCCACCAGCAACTACAGAGCCATGCAGCAAGCCTCCTGGGCTCAG 

LEVEGS TSNYRAMQQASWAQ 189 
601 GGCGCCGTGGTGGAGAATGGCGCCGCATACGTGCAGCCGCCGCCACACTCCGCGGCCATG 

GAVVENGAAYVQPPPHSAAM 209 
661 GACTCTGMCCCACCTTGCAAATTGGGTATCCTCATCMTTTGTGCCTGCTGAAGCAAAC 

DSEPTLQIGYPHQFV PAEAN 
721 ACTATTCAGAGGAGCACTGCCC^ggA^TGCAGAGMCMCTTCATGCTGGGATGGGTT 

TIQRST APAGAENNFMLGWV 

781 CTTTGAGCTAAGCAGCCATCGATCAGCTGTCAGAAGTTGGAGCTAATAATAAAAGGGATG 
L * 

841 TGGAGTGGGCTACATGTATCTCGGATCTCTCTGCGAGCCACCTMTGGTCTTGCGTGGCC
901 CTTTMTCTGTATGTTTTTGTGTGTMGCTACTGCTAGCTGTTTGCACCTTCTGCGTCCG 

961 TGGmTGTTTCCGTGCTACCTTTTTATGTTTTGATTTGGATCTTGmGAAAATAATCT
1021 TACCAGCTTTGGGTAAACTGTTT (A) n
FIG. 2

1 TATCCCCTTCCTCCAGGTGGCTTGTTTCTTGCAGTGGTGGTGGTGGTGGTGGTGAGATCT 
61 flGTTTGf;TT(^TTGGTGGCAGCTGGAGATCGATCGGGATG GGGAGGGGGCGGGTGGAGCT

MGRGRVEL 8 

MADS-box 

121 GAAGAGGATCGAGAACAAGATCAACCGGAAGGTGACGTTCGCCAAGAGGAGGAATGGCCT 

KRIENKINRKVTFAKRRNGL 28 

181 f^TTAAGMGGCGTACGAGCTCTCCGTCCTCTGCGACGCCGAGGTCGCCCTCATCATCTT 

LKKAYELS VLCDAEVALIIF 48 

241 rTrTAACCGCGGCAAGCTCTACGAGTTCT GCAGCACCCAGAGCATGACTAAAACGCTTGA 

SNRGKLYE FCSTQSMT KTLE 68 

301 GAAGTATCAGAAATGCAGTTACGCAGGACCCGAAACAGCTGTCCAAAATAGAGAAAGTGA

KYQKCS YAG PETAVQNRESE 88 

361 nrAATTnAMGCTAGCCGCMTGMTACCTCAMCTGAAGGCAAGGGTTGAAAATTTACA 

Q LKASRNE YLKLKARVENLQ 108 

K-box

421 ArGGArTrAMGAMTTTGCTGGGTCCAGATCTTGA TTCATTAGGCATAAAAGAGCTCGA 

RTQRNLLG PDLDSLGIKELE 128 

481 GAGCCTAGAGAAGCAGCTTGATTCATCCCTGAAGCACGTCAGAACTACAAGGACAAAACA 

S LEKQ LDSSLKHVRTTRTKH 148 

541 TCTGGTCGACCAACTGACGGAGCTTCAGAGAAAG GAACAAATGGTTTCTGAAGCAAATAG 

LVDQLTE LQRKE QMVS EANR 168 

601 ATGCCTTAGGAGAAAACTGGAGGAAAGCAACCATGTTCGCGGGCAGCAAGTGTGGGAGCA 

CLRRKLEESNHVRGQQVWEQ 188 

661 GGGCTGCMCTTMTTGGCTATGMCGTCAGCCTGMGTGCAGCAGCCTCTTCACGGCGG 

GCNL IGYE RQPEVQ Q PLHG G 208 

721 CMTGGGTTCTTCCATCCACTTGATGCTGCTGGTGMCCCACCCTTCAGATTGGGTACCC 

NG FF HP L DAAGE PT LQI G YP 228 

781 TGCAGAGCATCATGAGGCGATGAACAGTGCGTGCATGAACACCTACATGCCCCCATGGCT 

AE HHEAM NSACMNTYMP PWL 248

841 ACCATGATGATGACGGGACAATGAATTACGAAATAACAAGGATATGTGGCATATATGTGG 

P * 249 

901 TGCCGCATACATGCATGTATCATGGCTAGCTACTTMTTGGAGTGATGGATTTGAACTAG 
961 TTTCGTATGTAGCCTGTTTGTGTGTMCTTGTGTGAGATACTACCTTAAAAACTATCGGT 
1021 GTCTGTTGAACATATTCTGCGATCAACTTTAAGCGTATT (A) n 
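
The correspondence between the nucleotide rows and the amino acid rows of the figures may be reproduced with the standard genetic code. The following Python sketch is illustrative only: `translate` is a hypothetical helper, and the input is the row of FIG. 2 numbered 121, read from its second nucleotide because the first nucleotide of that row completes the codon begun on the preceding row.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str, offset: int = 0) -> str:
    """Translate `dna` in the reading frame starting at `offset`.

    Spaces are ignored, a trailing partial codon is dropped, and any codon
    containing a character other than A, C, G or T is reported as 'X'.
    """
    seq = dna.upper().replace(" ", "")[offset:]
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, usable, 3))


# Row 121 of FIG. 2 as printed in the figure.
FIG2_ROW_121 = "GAAGAGGATCGAGAACAAGATCAACCGGAAGGTGACGTTCGCCAAGAGGAGGAATGGCCT"

print(translate(FIG2_ROW_121, offset=1))  # prints KRIENKINRKVTFAKRRNG
```

The output matches the amino acid row printed beneath row 121 of FIG. 2 except for its final residue (L), whose codon is completed by the first nucleotide of row 181.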
FIG. 3

1 TGCTTTCCCCTCTCTTCCGCTTCGCGAGATTGGTTGATTCATCTCGCGATTGATCGAGCT
61 CGAGCGGCGGTGAGGTGAGGTGGAGGAGGAGGAGGAGGAGGAGATCGGGATG GGGAGAGG

M G R G 4

MADS-box

121 GAGGGTGGAGCTGAAGAGGATrnA GAACAAGATrAArAGGCAGGTGAPflTTCGCGAAGrG

RVELKR I ENKINRQVTFA K R 24

181 GACmTGGGC TGCTCMGAAC^TG^^

RNGLL KKAYELSVLCDAEVA 44

241 CCTCATCATCTTCTCCAACCGrnnr AAGCTCTAfnAnTTr Tfy a^rrwra a anna^r;^

LIIFSNRGKLYEFCSGQSMT 64

301 CAGAACTTTGGAAAGATACCAAAAATTCAGTTATGGTGGGCCAGATACTGCAATACAGAA

RTLE RY Q KF SYGGPDTA I QN 84

361 CAAGGAAAATGAGTTAGTGCAAAGCAGCCGC AATGAGTACCTCAAArT GAAGGrArnffaT

K E N E L V Q S S R N E Y L K L K A R V 104

421 «H^mACM A«aCCCAAA^ K-box

ENLQRTQRNLLGEDLGTLGI 124

481 AAAAGAGCTAGAGCAGCTTGAflA A ACAACTTGATTr ATCCTTGACnT ACATTAGATTr Ar

K E L E Q L E K Q L D S S L R H I R S T 144

541 AAGGACACAGCATATGCTTGATCAGCTCACTGATCTCCAGAGGAGGG AACAAATffTTCTf;

RTQHMLDQLTDLQRREQMLC 164

601 TGAAGCAAATAAGTGCCTCAGAAGAAAACTGGAGGAGAGCAACCAGTTGCATGGACAAGT

EANKCLRRKLEESNQLHGQV 184

661 GTGGGAGCACGGCGCCACCCTACTCGGCTACGAGCGGCAGTCGCCTCATGCCGTCCAGCA

WEHGATLLGYERQS P HAVQQ 204

721 GGTGCCACCGCACGGTGGCMCGGATTCTTCCATTCCCTGGAAGCTGCCGCCGAGCCCAC

VPPHGGNGFFHSLEAAAEPT 224

781 CTTGCAGATCGGGTTTACTCCAGAGCAGATGAACAACTCATGCGTGACTGCCTTCATGCC

LQIGF TP EQMNNSCVTA FMP 244

841 GACATGGCTACCCTGAACTCCTGAAGGCCGATGCGACAACCAATAAAAACGGATGTGACG

T W L P * 248

901 ACACAGATCAAGTCGCACCATTAGATTGATCTTCTCCTACAAGAGTGAGACTAGTAATTC

961 CGCGTTTGTGTQC^G^GTGTTGAAACTTTTCTGATGTGATGCACGCACTTTTMTTATT

1021 ATTAAGCGTTCAAGGACTAGTATGTGGTATAAAAGCCCGTACGTGACAGCCTATGGTTAT

1081 ATGCTGCGCAAAMCTACGTATGGTACAGTGCAGTGCCTGTACATTTCATMTTTGCGGG

1141 TAAAGTTTATTGACTATATATCCAGTGTGTCAAATATAAT (A) n